A comparative analysis of family-based and population-based association tests using whole genome sequence data

Bibliographic Details
Main Authors: Zhou, Jin; Yip, Wai-Ki; Cho, Michael; Qiao, Dandi; McDonald, Merry-Lynn; Laird, Nan
Other Authors: Biostatistics Department, Harvard School of Public Health, Boston, MA 02115 USA
Language: en
Published: BioMed Central 2014
Online Access: Zhou et al. BMC Proceedings 2014, 8(Suppl 1):S33 http://www.biomedcentral.com/1753-6561/8/S1/S33
http://hdl.handle.net/10150/610090
http://arizona.openrepository.com/arizona/handle/10150/610090
Description
Summary: The revolution in next-generation sequencing has made obtaining both common and rare high-quality sequence variants across the entire genome feasible. Because researchers are now faced with the analytical challenges of handling a massive amount of genetic variant information from sequencing studies, numerous methods have been developed to assess the impact of both common and rare variants on disease traits. In this report, whole genome sequencing data from Genetic Analysis Workshop 18 were used to compare the power of several methods, considering both family-based and population-based designs, to detect association of blood pressure with variants in the MAP4 gene region and on chromosome 3. To prioritize variants across the genome for testing, variants were first functionally assessed using prediction algorithms and expression quantitative trait loci (eQTL) data. Four set-based tests in the family-based association tests (FBAT) framework (FBAT-v, FBAT-lmm, FBAT-m, and FBAT-l) were used to analyze 20 pedigrees, and 2 variance component tests, sequence kernel association test (SKAT) and genome-wide complex trait analysis (GCTA), were used with 142 unrelated individuals in the sample. Both set-based and variance-component-based tests had high power and an adequate type I error rate. Of the various FBATs, FBAT-l demonstrated superior performance, indicating its potential for use in rare-variant analysis. The updated FBAT package is available at: http://www.hsph.harvard.edu/fbat/.