Multiple-testing correction in metabolome-wide association studies

Background: The search for statistically significant relationships between molecular markers and outcomes is challenging when dealing with high-dimensional, noisy and collinear multivariate omics data, such as metabolomic profiles. Permutation procedures allow for the estimation of adjusted signific...


Bibliographic Details
Main Authors: Ebbels, T.M.D. (Author), Glen, R. (Author), Peluso, A. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: View Fulltext in Publisher
LEADER 03252nam a2200589Ia 4500
001 10.1186-s12859-021-03975-2
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Multiple-testing correction in metabolome-wide association studies 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-03975-2 
520 3 |a Background: The search for statistically significant relationships between molecular markers and outcomes is challenging when dealing with high-dimensional, noisy and collinear multivariate omics data, such as metabolomic profiles. Permutation procedures allow for the estimation of adjusted significance levels without assuming independence among metabolomic variables. Nevertheless, the complex non-normal structure of metabolic profiles and outcomes may bias the permutation results, leading to overly conservative threshold estimates, i.e. lower than those from a Bonferroni or Sidak correction. Methods: Within a univariate permutation procedure we employ parametric simulation methods based on the multivariate (log-)Normal distribution to obtain adjusted significance levels which are consistent across different outcomes while effectively controlling the type I error rate. Next, we derive an alternative closed-form expression for the estimation of the number of non-redundant metabolic variates based on the spectral decomposition of their correlation matrix. The performance of the method is tested for different model parametrizations and across a wide range of correlation levels of the variates using synthetic and real data sets. Results: Both the permutation-based formulation and the more practical closed-form expression are found to give an effective indication of the number of independent metabolic effects exhibited by the system, while guaranteeing that the derived adjusted threshold is stable across outcome measures with diverse properties. © 2021, The Author(s). 
650 0 4 |a article 
650 0 4 |a biological model 
650 0 4 |a Closed-form expression 
650 0 4 |a Correlated tests 
650 0 4 |a Correlation matrix 
650 0 4 |a decomposition 
650 0 4 |a Diverse properties 
650 0 4 |a Error analysis 
650 0 4 |a family-wise error rate 
650 0 4 |a FWER 
650 0 4 |a genetic marker 
650 0 4 |a Genetic Markers 
650 0 4 |a genetics 
650 0 4 |a human 
650 0 4 |a Metabolism 
650 0 4 |a metabolome 
650 0 4 |a Metabolome 
650 0 4 |a metabolomics 
650 0 4 |a Metabolomics 
650 0 4 |a Models, Biological 
650 0 4 |a Multiple testing 
650 0 4 |a MWAS 
650 0 4 |a MWSL 
650 0 4 |a normal distribution 
650 0 4 |a Normal distribution 
650 0 4 |a outcome assessment 
650 0 4 |a Parametric simulations 
650 0 4 |a Permutation 
650 0 4 |a Permutation procedures 
650 0 4 |a procedures 
650 0 4 |a Significance levels 
650 0 4 |a simulation 
650 0 4 |a Spectral decomposition 
650 0 4 |a statistical distribution 
650 0 4 |a Statistical Distributions 
650 0 4 |a Synthetic and real data 
700 1 |a Ebbels, T.M.D.  |e author 
700 1 |a Glen, R.  |e author 
700 1 |a Peluso, A.  |e author 
773 |t BMC Bioinformatics
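
Illustrative note: the abstract refers to a univariate permutation procedure for estimating an adjusted significance level (the MWSL listed among the subjects) and an effective number of independent tests. The sketch below is a generic min-p permutation scheme, not the authors' implementation. It does not include their parametric (log-)Normal simulation step or their closed-form spectral expression, neither of which is specified in this record. The function names, the Fisher z approximation used for p-values, and all default parameter values are assumptions made for illustration only.

```python
import math
import random
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r via the Fisher z approximation (an assumption;
    an exact t-test could be substituted)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def permutation_mwsl(features, outcome, alpha=0.05, n_perm=200, seed=0):
    """Estimate an adjusted per-test threshold by permuting the outcome.

    features: list of feature vectors (one list of floats per variable).
    Returns (mwsl, m_eff), where mwsl is the empirical alpha-quantile of the
    minimum p-value under permutation and m_eff = alpha / mwsl is a
    Bonferroni-style effective number of tests.
    """
    rng = random.Random(seed)
    n = len(outcome)
    shuffled = list(outcome)
    min_pvals = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any feature-outcome association
        min_pvals.append(
            min(approx_p_value(pearson_r(f, shuffled), n) for f in features)
        )
    min_pvals.sort()
    k = max(int(math.floor(alpha * n_perm)) - 1, 0)  # index of the alpha-quantile
    mwsl = min_pvals[k]
    return mwsl, alpha / mwsl


if __name__ == "__main__":
    rng = random.Random(1)
    n = 60
    base = [rng.gauss(0, 1) for _ in range(n)]
    # Ten strongly correlated synthetic features sharing one underlying signal.
    feats = [[b + 0.3 * rng.gauss(0, 1) for b in base] for _ in range(10)]
    y = [rng.gauss(0, 1) for _ in range(n)]
    print(permutation_mwsl(feats, y))
```

Because the outcome is permuted while the feature matrix is left intact, the correlation among features is preserved in every permutation, which is why the resulting threshold does not rely on independence among the variables. A Sidak-style effective number of tests could be obtained instead as log(1 - alpha) / log(1 - mwsl).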