Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN

The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic...

Bibliographic Details
Main Authors: Marc Gregory Dumont, Claudia Lüke, Yongcui Deng, Peter Frenzel
Format: Article
Language: English
Published: Frontiers Media S.A. 2014-02-01
Series: Frontiers in Microbiology
Subjects: pyrosequencing; diversity; methanotroph; pmoA; NGS data analysis
Online Access: http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full
id doaj-239e294ed40a4102842f7e7905fb6ab7
record_format Article
spelling doaj-239e294ed40a4102842f7e7905fb6ab7
2020-11-25T00:18:40Z
eng
Frontiers Media S.A.
Frontiers in Microbiology
1664-302X
2014-02-01
5
10.3389/fmicb.2014.00034
67058
Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN
Marc Gregory Dumont (Max-Planck-Institute for Terrestrial Microbiology)
Claudia Lüke (Max-Planck-Institute for Terrestrial Microbiology; Radboud University)
Yongcui Deng (Max-Planck-Institute for Terrestrial Microbiology; University of the Chinese Academy of Sciences)
Peter Frenzel (Max-Planck-Institute for Terrestrial Microbiology)
http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full
pyrosequencing
diversity
methanotroph
pmoA
NGS data analysis
collection DOAJ
language English
format Article
sources DOAJ
author Marc Gregory Dumont
Claudia Lüke
Yongcui Deng
Peter Frenzel
title Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN
publisher Frontiers Media S.A.
series Frontiers in Microbiology
issn 1664-302X
publishDate 2014-02-01
description The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including both sequences from cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level taxa (genus-level) are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty.  
topic pyrosequencing
diversity
methanotroph
pmoA
NGS data analysis
url http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full