A two-stage microbial association mapping framework with advanced FDR control

Abstract. Background: In microbiome studies, it is important to detect taxa which are associated with pathological outcomes at the lowest definable taxonomic rank, such as genus or species. Traditionally, taxa at the target rank are tested for individual association, followed by the Benjamini-Hochberg...


Bibliographic Details
Main Authors: Jiyuan Hu, Hyunwook Koh, Linchen He, Menghan Liu, Martin J. Blaser, Huilin Li
Format: Article
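Language: English
Published: BMC 2018-07-01
Series: Microbiome
Subjects: Microbiome; Two-stage microbial association mapping; Taxonomic tree; Microbial group association test; False discovery rate; Hierarchical BH
Online Access: http://link.springer.com/article/10.1186/s40168-018-0517-1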
id doaj-1fb9b2f922694677ac3a783669a7e8b1
record_format Article
spelling doaj-1fb9b2f922694677ac3a783669a7e8b1 2020-11-25T02:40:40Z eng BMC Microbiome 2049-2618 2018-07-01 61116 10.1186/s40168-018-0517-1
A two-stage microbial association mapping framework with advanced FDR control
Jiyuan Hu (0), Hyunwook Koh (1), Linchen He (2), Menghan Liu (3), Martin J. Blaser (4), Huilin Li (5)
(0) Division of Biostatistics, Department of Population Health, New York University School of Medicine
(1) Division of Biostatistics, Department of Population Health, New York University School of Medicine
(2) Division of Biostatistics, Department of Population Health, New York University School of Medicine
(3) Department of Medicine, New York University School of Medicine
(4) Department of Medicine, New York University School of Medicine
(5) Division of Biostatistics, Department of Population Health, New York University School of Medicine
http://link.springer.com/article/10.1186/s40168-018-0517-1
collection DOAJ
language English
format Article
sources DOAJ
author Jiyuan Hu
Hyunwook Koh
Linchen He
Menghan Liu
Martin J. Blaser
Huilin Li
title A two-stage microbial association mapping framework with advanced FDR control
publisher BMC
series Microbiome
issn 2049-2618
publishDate 2018-07-01
description Abstract
Background: In microbiome studies, it is important to detect taxa that are associated with pathological outcomes at the lowest definable taxonomic rank, such as genus or species. Traditionally, taxa at the target rank are tested individually for association, and the Benjamini-Hochberg (BH) procedure is then applied to control the false discovery rate (FDR). However, this approach neglects the dependence structure among taxa and may lead to conservative results. The taxonomic tree of microbiome data represents alignment from phylum to species rank and characterizes evolutionary relationships across microbial taxa. Taxa that are closer on the tree usually have similar responses to the exposure (environment). The statistical power of microbial association tests can be enhanced by efficiently employing this prior evolutionary information via the taxonomic tree.
Methods: We propose a two-stage microbial association mapping framework (massMap) which uses grouping information from the taxonomic tree to strengthen statistical power in association tests at the target rank. massMap first screens the association of taxonomic groups at a pre-selected higher taxonomic rank using a powerful microbial group test, OMiAT. The method then tests the association of each candidate taxon at the target rank within the significant taxonomic groups identified in the first stage. Hierarchical BH (HBH) and selected subset testing (SST) procedures are evaluated to control the FDR for the two-stage structured tests.
Results: Our simulations show that massMap, incorporating OMiAT and the advanced FDR controlling methodologies, largely alleviates the multiplicity issue. It is statistically more powerful than traditional association mapping performed directly at the target rank, while controlling the FDR at desired levels under most scenarios. In our real data analyses, massMap detects at least as many associated species as the traditional method, with smaller adjusted p values, which further illustrates the efficiency of the proposed framework. The R package of massMap is publicly available at https://sites.google.com/site/huilinli09/software and https://github.com/JiyuanHu/.
Conclusions: massMap is a novel microbial association mapping framework and achieves additional efficiency by utilizing the intrinsic taxonomic structure of microbiome data.
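The two-stage idea in the abstract can be illustrated with a minimal Python sketch. This is not the massMap R package and not the authors' HBH or SST procedure: it is a simplified scheme that applies BH across groups and then BH again within each selected group. The function names (`bh_adjust`, `two_stage_map`), the dictionary inputs, and the level `q` are hypothetical, and the group-level p-values are assumed to be supplied by some group test such as OMiAT rather than computed here.

```python
from collections import defaultdict


def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def two_stage_map(group_pvals, taxon_pvals, taxon_group, q=0.05):
    """Simplified two-stage screen (illustrative assumption, not HBH/SST).

    group_pvals: dict mapping group name -> group-level p-value
    taxon_pvals: dict mapping taxon name -> taxon-level p-value
    taxon_group: dict mapping taxon name -> its group name
    """
    # Stage 1: BH across the higher-rank groups.
    groups = list(group_pvals)
    group_adj = bh_adjust([group_pvals[g] for g in groups])
    selected = {g for g, a in zip(groups, group_adj) if a <= q}

    # Stage 2: BH within each selected group only.
    members = defaultdict(list)
    for taxon, group in taxon_group.items():
        if group in selected:
            members[group].append(taxon)

    discoveries = {}
    for group, taxa in members.items():
        taxon_adj = bh_adjust([taxon_pvals[t] for t in taxa])
        for taxon, a in zip(taxa, taxon_adj):
            if a <= q:
                discoveries[taxon] = a
    return selected, discoveries
```

Taxa in groups that fail the first stage are never tested, which is how a scheme of this kind reduces the number of hypotheses at the target rank. Testing every selected family at the same level `q` is a simplification made for this sketch; the procedures evaluated in the article handle the stage-two level differently.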
topic Microbiome
Two-stage microbial association mapping
Taxonomic tree
Microbial group association test
False discovery rate
Hierarchical BH
url http://link.springer.com/article/10.1186/s40168-018-0517-1
_version_ 1724780389073944576