MetaDEGalaxy: Galaxy workflow for differential abundance analysis of 16s metagenomic data [version 2; peer review: 2 approved]

Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences. While software for detailing the composition of microbial communities using 16S rRNA marker genes is relatively mature, increasingly researchers are interested in identifying changes exhibited within mic...

Bibliographic Details
Main Authors: Mike W.C. Thang, Xin-Yi Chua, Gareth Price, Dominique Gorse, Matt A. Field
Format: Article
Language: English
Published: F1000 Research Ltd 2019-10-01
Series: F1000Research
Online Access: https://f1000research.com/articles/8-726/v2
id doaj-5809d5361ec3462a81ce48171dcba3f6
record_format Article
spelling doaj-5809d5361ec3462a81ce48171dcba3f6
2020-11-25T03:08:09Z
eng
F1000 Research Ltd
F1000Research
2046-1402
2019-10-01
8
10.12688/f1000research.18866.2
23147
MetaDEGalaxy: Galaxy workflow for differential abundance analysis of 16s metagenomic data [version 2; peer review: 2 approved]
Mike W.C. Thang [0]
Xin-Yi Chua [1]
Gareth Price [2]
Dominique Gorse [3]
Matt A. Field [4]
[0] Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, 4000, Australia
[1] Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, 4000, Australia
[2] Queensland Facility for Advanced Bioinformatics, University of Queensland, Brisbane, Queensland, 4000, Australia
[3] Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, 4000, Australia
[4] John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
https://f1000research.com/articles/8-726/v2
collection DOAJ
language English
format Article
sources DOAJ
author Mike W.C. Thang
Xin-Yi Chua
Gareth Price
Dominique Gorse
Matt A. Field
title MetaDEGalaxy: Galaxy workflow for differential abundance analysis of 16s metagenomic data [version 2; peer review: 2 approved]
publisher F1000 Research Ltd
series F1000Research
issn 2046-1402
publishDate 2019-10-01
description Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences. While software for detailing the composition of microbial communities using 16S rRNA marker genes is relatively mature, researchers are increasingly interested in identifying changes exhibited within microbial communities under differing environmental conditions. To gain maximum value from metagenomic sequence data, we must improve the existing analysis environment by providing accessible and scalable computational workflows able to generate reproducible results. Here we describe a complete end-to-end open-source metagenomics workflow running within Galaxy for 16S differential abundance analysis. The workflow accepts 454 or Illumina sequence data (either overlapping or non-overlapping paired-end reads) and outputs lists of the operational taxonomic units (OTUs) exhibiting the greatest change under differing conditions. A range of analysis steps and graphing options is available, giving users a high level of control over their data and analyses. Additionally, users can input complex sample-specific metadata, which can be incorporated into differential analysis and used for grouping and colouring within graphs. Detailed tutorials containing sample data and existing workflows are available for three different input types: overlapping read pairs, non-overlapping read pairs, and pre-generated Biological Observation Matrix (BIOM) files. Using the Galaxy platform we developed MetaDEGalaxy, a complete metagenomics differential abundance analysis workflow. MetaDEGalaxy is designed for bench scientists working with 16S data who are interested in comparative metagenomics. It builds on momentum within the wider Galaxy metagenomics community with the hope that more tools will be added as existing methods mature.
url https://f1000research.com/articles/8-726/v2