A Review of Scalable Bioinformatics Pipelines
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2017-10-01 |
Series: | Data Science and Engineering |
Subjects: | |
Online Access: | http://link.springer.com/article/10.1007/s41019-017-0047-z |
Summary: | Scalability is increasingly important for bioinformatics analysis services, since these must handle larger datasets, more jobs, and more users. The pipelines used to implement analyses must therefore scale with respect to the resources on a single compute node, the number of nodes on a cluster, and also to cost-performance. Here, we survey several scalable bioinformatics pipelines and compare their design and their use of underlying frameworks and infrastructures. We also discuss current trends for bioinformatics pipeline development. |
ISSN: | 2364-1185 2364-1541 |