Computational workflow for the fine-grained analysis of metagenomic samples

Abstract Background The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and...

Bibliographic Details
Main Authors: Esteban Pérez-Wohlfeil, Jose A. Arjona-Medina, Oscar Torreno, Eugenia Ulzurrun, Oswaldo Trelles
Format: Article
Language: English
Published: BMC 2016-10-01
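Series: BMC Genomics
Subjects: Metagenome analysis; Differential abundance; Annotational mapping; Mapping over specific regions; Open platform
Online Access: http://link.springer.com/article/10.1186/s12864-016-3063-x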
id doaj-ef466d15d0c14365a08b3710c5854ef7
record_format Article
spelling doaj-ef466d15d0c14365a08b3710c5854ef7
2020-11-25T01:00:59Z
eng
BMC
BMC Genomics
1471-2164
2016-10-01
17(S8): 351-361
10.1186/s12864-016-3063-x
Computational workflow for the fine-grained analysis of metagenomic samples
Esteban Pérez-Wohlfeil (Department of Computer Architecture, University of Málaga)
Jose A. Arjona-Medina (Advanced Computing Technologies Unit, RISC Software GmbH)
Oscar Torreno (Department of Computer Architecture, University of Málaga)
Eugenia Ulzurrun (Department of Computer Architecture, University of Málaga)
Oswaldo Trelles (Department of Computer Architecture, University of Málaga)
http://link.springer.com/article/10.1186/s12864-016-3063-x
Metagenome analysis
Differential abundance
Annotational mapping
Mapping over specific regions
Open platform
collection DOAJ
language English
format Article
sources DOAJ
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2016-10-01
description Abstract
Background: The field of metagenomics, defined as the direct genetic analysis of uncultured samples of genomes contained within an environmental sample, is gaining increasing popularity. The aim of studies of metagenomics is to determine the species present in an environmental community and identify changes in the abundance of species under different conditions. Current metagenomic analysis software faces bottlenecks due to the high computational load required to analyze complex samples.
Results: A computational open-source workflow has been developed for the detailed analysis of metagenomes. This workflow provides new tools and datafile specifications that facilitate the identification of differences in abundance of reads assigned to taxa (mapping), enables the detection of reads of low-abundance bacteria (producing evidence of their presence), provides new concepts for filtering spurious matches, etc. Innovative visualization ideas for improved display of metagenomic diversity are also proposed to better understand how reads are mapped to taxa. Illustrative examples are provided based on the study of two collections of metagenomes from faecal microbial communities of adult female monozygotic and dizygotic twin pairs concordant for leanness or obesity and their mothers.
Conclusions: The proposed workflow provides an open environment that offers the opportunity to perform the mapping process using different reference databases. Additionally, this workflow shows the specifications of the mapping process and datafile formats to facilitate the development of new plugins for further post-processing. This open and extensible platform has been designed with the aim of enabling in-depth analysis of metagenomic samples and better understanding of the underlying biological processes.
topic Metagenome analysis
Differential abundance
Annotational mapping
Mapping over specific regions
Open platform
url http://link.springer.com/article/10.1186/s12864-016-3063-x