Watchdog – a workflow management system for the distributed analysis of large-scale experimental data

Abstract Background: The development of high-throughput experimental technologies, such as next-generation sequencing, has led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutu...


Bibliographic Details
Main Authors: Michael Kluge, Caroline C. Friedel
Format: Article
Language: English
Published: BMC 2018-03-01
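Series: BMC Bioinformatics
Subjects: Workflow management system; High-throughput experiments; Large-scale datasets; Automated execution; Distributed analysis; Reusability
Online Access: http://link.springer.com/article/10.1186/s12859-018-2107-4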
id doaj-92f07f7190d14ce2ade9b482a3a25c4a
record_format Article
spelling doaj-92f07f7190d14ce2ade9b482a3a25c4a
2020-11-25T02:10:27Z
eng
BMC
BMC Bioinformatics
1471-2105
2018-03-01
19; 1; 1; 13
10.1186/s12859-018-2107-4
Watchdog – a workflow management system for the distributed analysis of large-scale experimental data
Michael Kluge (0)
Caroline C. Friedel (1)
Institute for Informatics, Ludwig-Maximilians-Universität München
Institute for Informatics, Ludwig-Maximilians-Universität München
Abstract Background: The development of high-throughput experimental technologies, such as next-generation sequencing, has led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dependent steps applied to numerous samples for multiple conditions and replicates. To support these analyses, a number of workflow management systems (WMSs) have been developed to allow automated execution of corresponding analysis workflows. Major advantages of WMSs are the easy reproducibility of results as well as the reusability of workflows or their components. Results: In this article, we present Watchdog, a WMS for the automated analysis of large-scale experimental data. Main features include straightforward processing of replicate data, support for distributed computer systems, customizable error detection and manual intervention in workflow execution. Watchdog is implemented in Java and is thus platform-independent, and it allows easy sharing of workflows and corresponding program modules. It provides a graphical user interface (GUI) for workflow construction using pre-defined modules as well as a helper script for creating new module definitions. Execution of workflows is possible using either the GUI or a command-line interface, and a web interface is provided for monitoring the execution status and intervening in case of errors. To illustrate its potential on a real-life example, a comprehensive workflow and modules for the analysis of RNA-seq experiments were implemented and are provided with the software in addition to simple test examples. Conclusions: Watchdog is a powerful and flexible WMS for the analysis of large-scale high-throughput experiments. We believe it will greatly benefit both users with and without programming skills who want to develop and apply bioinformatical workflows with reasonable overhead. The software, example workflows and comprehensive documentation are freely available at www.bio.ifi.lmu.de/watchdog.
http://link.springer.com/article/10.1186/s12859-018-2107-4
Workflow management system
High-throughput experiments
Large-scale datasets
Automated execution
Distributed analysis
Reusability
collection DOAJ
language English
format Article
sources DOAJ
author Michael Kluge
Caroline C. Friedel
spellingShingle Michael Kluge
Caroline C. Friedel
Watchdog – a workflow management system for the distributed analysis of large-scale experimental data
BMC Bioinformatics
Workflow management system
High-throughput experiments
Large-scale datasets
Automated execution
Distributed analysis
Reusability
author_facet Michael Kluge
Caroline C. Friedel
author_sort Michael Kluge
title Watchdog – a workflow management system for the distributed analysis of large-scale experimental data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2018-03-01
description Abstract Background: The development of high-throughput experimental technologies, such as next-generation sequencing, has led to new challenges for handling, analyzing and integrating the resulting large and diverse datasets. Bioinformatical analysis of these data commonly requires a number of mutually dependent steps applied to numerous samples for multiple conditions and replicates. To support these analyses, a number of workflow management systems (WMSs) have been developed to allow automated execution of corresponding analysis workflows. Major advantages of WMSs are the easy reproducibility of results as well as the reusability of workflows or their components. Results: In this article, we present Watchdog, a WMS for the automated analysis of large-scale experimental data. Main features include straightforward processing of replicate data, support for distributed computer systems, customizable error detection and manual intervention in workflow execution. Watchdog is implemented in Java and is thus platform-independent, and it allows easy sharing of workflows and corresponding program modules. It provides a graphical user interface (GUI) for workflow construction using pre-defined modules as well as a helper script for creating new module definitions. Execution of workflows is possible using either the GUI or a command-line interface, and a web interface is provided for monitoring the execution status and intervening in case of errors. To illustrate its potential on a real-life example, a comprehensive workflow and modules for the analysis of RNA-seq experiments were implemented and are provided with the software in addition to simple test examples. Conclusions: Watchdog is a powerful and flexible WMS for the analysis of large-scale high-throughput experiments. We believe it will greatly benefit both users with and without programming skills who want to develop and apply bioinformatical workflows with reasonable overhead. The software, example workflows and comprehensive documentation are freely available at www.bio.ifi.lmu.de/watchdog.
topic Workflow management system
High-throughput experiments
Large-scale datasets
Automated execution
Distributed analysis
Reusability
url http://link.springer.com/article/10.1186/s12859-018-2107-4