A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies.

In genetic epidemiology, studies of genotype-phenotype associations have made significant contributions to the genetics of complex human traits. These studies depend on specialized statistical methods to uncover associations between traits and genetic variants, both common and rare v...


Bibliographic Details
Main Author: Li-Chu Chien
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0233847
id doaj-bb8e9151c3f24cd0884096635d44a4e8
record_format Article
spelling doaj-bb8e9151c3f24cd0884096635d44a4e8; 2021-03-03T21:49:05Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15; 6; e0233847; 10.1371/journal.pone.0233847; A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies.; Li-Chu Chien; https://doi.org/10.1371/journal.pone.0233847
collection DOAJ
language English
format Article
sources DOAJ
author Li-Chu Chien
title A rank-based normalization method with the fully adjusted full-stage procedure in genetic association studies.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2020-01-01
description In genetic epidemiology, studies of genotype-phenotype associations have made significant contributions to the genetics of complex human traits. These studies depend on specialized statistical methods to uncover associations between traits and genetic variants, both common and rare. Analyses of such studies often need to account for potentially confounding factors, such as social and environmental conditions. Multiple linear regression is the most widely used type of regression analysis when the outcome of interest is a quantitative trait. Many statistical tests for identifying genotype-phenotype associations using linear regression rely on the assumption that the traits (or the residuals of the regression) follow a normal distribution. In genomic research, the rank-based inverse normal transformation (INT) is one of the most popular approaches for obtaining normally distributed traits (or normally distributed residuals). Many researchers believe that applying the INT to non-normal traits (or non-normal residuals) is required for valid inference, because phenotypic (or residual) outliers and non-normality have a significant influence on both type I error rate control and statistical power, especially in rare-variant association testing. Here we propose a test for exploring the association of a rare variant with a quantitative trait by using a fully adjusted full-stage INT. Using simulations, we show that the fully adjusted full-stage INT is more appropriate than existing INT methods, such as the fully adjusted two-stage INT and the INT-based omnibus test, for testing genotype-phenotype associations with rare variants, especially when genotypes are uncorrelated with covariates. The fully adjusted full-stage INT retains the advantages of the fully adjusted two-stage INT and ameliorates its problems in the analysis of rare variants under non-normality of the trait. We also present theoretical results on these desirable properties. In addition, two available methods for non-normal traits, quantile/median regression and the Yeo-Johnson power transformation, are included in the simulations and compared with respect to these properties.
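The abstract refers to the rank-based inverse normal transformation (INT) without giving a formula. As a rough illustration only, the sketch below implements the commonly used unadjusted form, which maps each value's rank r among n observations to the standard normal quantile of (r - c) / (n - 2c + 1), with the Blom offset c = 3/8 assumed as the default. It is not the fully adjusted full-stage procedure proposed in the article, whose details this record does not provide. The function name `rank_inverse_normal`, the `offset` parameter, and the sample data are hypothetical.

```python
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.375):
    """Unadjusted rank-based INT (hypothetical helper, Blom offset assumed)."""
    n = len(values)
    # Indices of the observations, sorted by their values.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Tied values share the average of their 1-based ranks.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to a probability strictly inside (0, 1), then to a quantile.
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


# Made-up, right-skewed trait values with one outlier.
skewed_trait = [0.1, 0.4, 0.2, 25.0, 1.3, 0.9, 3.7]
transformed = rank_inverse_normal(skewed_trait)
print([round(z, 3) for z in transformed])
```

Because the transformation depends only on ranks, the outlier 25.0 ends up no further from the centre than the smallest value does on the other side, which is why the INT is often used to limit the influence of phenotypic outliers.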
url https://doi.org/10.1371/journal.pone.0233847