Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN
The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic...
Main Authors: | Marc Gregory Dumont, Claudia Lüke, Yongcui Deng, Peter Frenzel |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2014-02-01 |
Series: | Frontiers in Microbiology |
Subjects: | pyrosequencing, diversity, methanotroph, pmoA, NGS data analysis |
Online Access: | http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full |
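
The lowest common ancestor (LCA) step named in the title can be illustrated with a minimal Python sketch. It is not MEGAN's implementation and does not use the article's pmoA taxonomy: the reference IDs, the lineage labels, and the `min_score` and `top_percent` values below are assumptions chosen only for illustration. The idea is that a read is assigned to the deepest taxon shared by all of its sufficiently good BLAST hits.

```python
from typing import Dict, List, Tuple

# Hypothetical reference lineages (root -> genus). These labels are
# placeholders for illustration, not the taxonomy built in the article.
LINEAGES: Dict[str, List[str]] = {
    "ref_A": ["root", "Gammaproteobacteria", "Methylobacter"],
    "ref_B": ["root", "Gammaproteobacteria", "Methylomonas"],
    "ref_C": ["root", "Alphaproteobacteria", "Methylocystis"],
}


def lowest_common_ancestor(lineages: List[List[str]]) -> List[str]:
    """Return the longest shared prefix of the given root-to-leaf lineages."""
    common: List[str] = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            common.append(ranks[0])
        else:
            break
    return common


def classify_read(
    hits: List[Tuple[str, float]],
    min_score: float = 50.0,    # assumed bit-score floor, not a published setting
    top_percent: float = 10.0,  # assumed window below the best score
) -> str:
    """Assign one read from its BLAST hits, given as (reference_id, bit_score)."""
    # Discard weak hits and hits to references without a known lineage.
    kept = [(ref, score) for ref, score in hits
            if score >= min_score and ref in LINEAGES]
    if not kept:
        return "unassigned"
    # Keep only hits scoring within top_percent of the best remaining hit.
    best = max(score for _, score in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    lineages = [LINEAGES[ref] for ref, score in kept if score >= cutoff]
    # The read is placed at the deepest node shared by all retained hits.
    return lowest_common_ancestor(lineages)[-1]
```

With these placeholder lineages, a read whose two best hits fall in different genera of the same class is placed at the class level, while a read with one clearly dominant hit is placed at genus level. A read with no hit above the score floor stays unassigned, which is the kind of outcome that flags contaminants or highly divergent sequences for inspection.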
id |
doaj-239e294ed40a4102842f7e7905fb6ab7 |
---|---|
record_format |
Article |
spelling |
doaj-239e294ed40a4102842f7e7905fb6ab7; 2020-11-25T00:18:40Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2014-02-01; 5; 10.3389/fmicb.2014.00034; 67058
Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN
Marc Gregory Dumont (0); Claudia Lüke (1, 2); Yongcui Deng (3, 4); Peter Frenzel (5)
(0) Max-Planck-Institute for Terrestrial Microbiology; (1) Max-Planck-Institute for Terrestrial Microbiology; (2) Radboud University; (3) Max-Planck-Institute for Terrestrial Microbiology; (4) University of the Chinese Academy of Sciences; (5) Max-Planck-Institute for Terrestrial Microbiology
The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including both sequences from cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level taxa (genus-level) are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty.
http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full
pyrosequencing; diversity; methanotroph; pmoA; NGS data analysis |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Marc Gregory Dumont, Claudia Lüke, Yongcui Deng, Peter Frenzel |
author_sort |
Marc Gregory Dumont |
title |
Classification of pmoA amplicon pyrosequences using BLAST and the lowest common ancestor method in MEGAN |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Microbiology |
issn |
1664-302X |
publishDate |
2014-02-01 |
description |
The classification of high-throughput sequencing data of protein-encoding genes is not as well established as for 16S rRNA. The objective of this work was to develop a simple and accurate method of classifying large datasets of pmoA sequences, a common marker for methanotrophic bacteria. A taxonomic system for pmoA was developed based on a phylogenetic analysis of available sequences. The taxonomy incorporates the known diversity of pmoA present in public databases, including both sequences from cultivated and uncultivated organisms. Representative sequences from closely related genes, such as those encoding the bacterial ammonia monooxygenase, were also included in the pmoA taxonomy. In total, 53 low-level taxa (genus-level) are included. Using previously published datasets of high-throughput pmoA amplicon sequence data, we tested two approaches for classifying pmoA: a naïve Bayesian classifier and BLAST. Classification of pmoA sequences based on BLAST analyses was performed using the lowest common ancestor (LCA) algorithm in MEGAN, a software program commonly used for the analysis of metagenomic data. Both the naïve Bayesian and BLAST methods were able to classify pmoA sequences and provided similar classifications; however, the naïve Bayesian classifier was prone to misclassifying contaminant sequences present in the datasets. Another advantage of the BLAST/LCA method was that it provided a user-interpretable output and enabled novelty detection at various levels, from highly divergent pmoA sequences to genus-level novelty. |
topic |
pyrosequencing, diversity, methanotroph, pmoA, NGS data analysis |
url |
http://journal.frontiersin.org/Journal/10.3389/fmicb.2014.00034/full |