Longitudinal data analysis for rare variants detection with penalized quadratic inference function

Abstract Longitudinal genetic data provide more information about genetic effects over time than cross-sectional data. Coupled with next-generation sequencing technologies, they make it feasible to identify important genes containing both rare and common variants in a longitudinal design....

Bibliographic Details
Main Authors: Hongyan Cao, Zhi Li, Haitao Yang, Yuehua Cui, Yanbo Zhang
Format: Article
Language: English
Published: Nature Publishing Group 2017-04-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-017-00712-9
id doaj-df13efaeca734d6c994cf7dd22e9c555
record_format Article
spelling doaj-df13efaeca734d6c994cf7dd22e9c555
2020-12-08T00:17:27Z
eng
Nature Publishing Group
Scientific Reports
2045-2322
2017-04-01
vol. 7, no. 1, pp. 1-11
10.1038/s41598-017-00712-9
Longitudinal data analysis for rare variants detection with penalized quadratic inference function
Hongyan Cao (Shanxi Medical University, Department of Health Statistics)
Zhi Li (North University of China, School of Sport and Physical Education)
Haitao Yang (Hebei Medical University, Department of Epidemiology and Health Statistics)
Yuehua Cui (Shanxi Medical University, Department of Health Statistics)
Yanbo Zhang (Shanxi Medical University, Department of Health Statistics)
https://doi.org/10.1038/s41598-017-00712-9
collection DOAJ
language English
format Article
sources DOAJ
author Hongyan Cao
Zhi Li
Haitao Yang
Yuehua Cui
Yanbo Zhang
title Longitudinal data analysis for rare variants detection with penalized quadratic inference function
publisher Nature Publishing Group
series Scientific Reports
issn 2045-2322
publishDate 2017-04-01
description Abstract Longitudinal genetic data provide more information about genetic effects over time than cross-sectional data. Coupled with next-generation sequencing technologies, they make it feasible to identify important genes containing both rare and common variants in a longitudinal design. In this work, we adopted a weighted sum statistic (WSS) to collapse multiple variants in a gene region into a gene score. When multiple genes in a pathway were considered together, a penalized longitudinal model under the quadratic inference function (QIF) framework was applied for efficient gene selection. We evaluated the estimation accuracy and model selection performance under different model settings, then applied the method to a real dataset from the Genetic Analysis Workshop 18 (GAW18). Compared with the unpenalized QIF method, the penalized QIF (pQIF) method achieved better estimation accuracy and higher selection efficiency. The pQIF remained optimal even when the working correlation structure was mis-specified. The real data analysis identified one important gene, angiotensin II receptor type 1 (AGTR1), in the Ca2+/AT-IIR/α-AR signaling pathway. The estimated effect implied that AGTR1 may have a protective effect against hypertension. Our pQIF method provides a general tool for longitudinal sequencing studies involving large numbers of genetic variants.
url https://doi.org/10.1038/s41598-017-00712-9