Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts.

Evidence from both GWAS and clinical observation has suggested that certain psychiatric, metabolic, and autoimmune diseases are heterogeneous, comprising multiple subtypes with distinct genomic etiologies and Polygenic Risk Scores (PRS). However, the presence of subtypes within many phenotypes is fr...

Bibliographic Details
Main Authors: Jie Yuan, Henry Xing, Alexandre Louis Lamy, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Todd Lencz, Itsik Pe'er
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-09-01
Series: PLoS Genetics
Online Access: https://doi.org/10.1371/journal.pgen.1009015
id doaj-6957cbd7c9034e58abce51f27d059b13
record_format Article
spelling doaj-6957cbd7c9034e58abce51f27d059b13 2021-04-21T14:35:03Z eng Public Library of Science (PLoS) PLoS Genetics 1553-7390 1553-7404 2020-09-01 16(9) e1009015 10.1371/journal.pgen.1009015
collection DOAJ
language English
format Article
sources DOAJ
author Jie Yuan
Henry Xing
Alexandre Louis Lamy
Schizophrenia Working Group of the Psychiatric Genomics Consortium
Todd Lencz
Itsik Pe'er
title Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts.
publisher Public Library of Science (PLoS)
series PLoS Genetics
issn 1553-7390
1553-7404
publishDate 2020-09-01
description Evidence from both GWAS and clinical observation has suggested that certain psychiatric, metabolic, and autoimmune diseases are heterogeneous, comprising multiple subtypes with distinct genomic etiologies and Polygenic Risk Scores (PRS). However, the presence of subtypes within many phenotypes is frequently unknown. We present CLiP (Correlated Liability Predictors), a method to detect heterogeneity in single GWAS cohorts. CLiP calculates a weighted sum of correlations between SNPs contributing to a PRS on the case/control liability scale. We demonstrate mathematically and through simulation that among i.i.d. homogeneous cases generated by a liability threshold model, significant anti-correlations are expected between otherwise independent predictors due to ascertainment on the hidden liability score. In the presence of heterogeneity from distinct etiologies, confounding by covariates, or mislabeling, these correlation patterns are altered predictably. We further extend our method to two additional association study designs: CLiP-X for quantitative predictors in applications such as transcriptome-wide association, and CLiP-Y for quantitative phenotypes, where there is no clear distinction between cases and controls. Through simulations, we demonstrate that CLiP and its extensions reliably distinguish between homogeneous and heterogeneous cohorts when the PRS explains as low as 3% of variance on the liability scale and cohorts comprise 50,000-100,000 samples, an increasingly practical size for modern GWAS. We apply CLiP to heterogeneity detection in schizophrenia cohorts totaling >50,000 cases and controls collected by the Psychiatric Genomics Consortium. We observe significant heterogeneity in mega-analysis of the combined PGC data (p-value 8.54 × 10⁻⁴), as well as in individual cohorts meta-analyzed using Fisher's method (p-value 0.03), based on significantly associated variants. We also apply CLiP-Y to detect heterogeneity in neuroticism in over 10,000 individuals from the UK Biobank and detect heterogeneity with a p-value of 1.68 × 10⁻⁹. Scores were not significantly reduced when partitioning by known subclusters ("Depression" and "Worry"), suggesting that these factors are not the primary source of observed heterogeneity.
url https://doi.org/10.1371/journal.pgen.1009015
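
Illustration (not part of the catalog record): the ascertainment effect described in the abstract can be reproduced in a small simulation. The sketch below is a toy under stated assumptions, not the authors' CLiP implementation. The function names, the equal effect sizes, the allele frequency of 0.5, the 30% variance explained, the 5% prevalence, and the use of effect-size products as pair weights are all hypothetical choices made for the example. It draws independent SNPs, builds a liability as PRS plus noise, labels the top 5% as cases, and compares a weighted sum of pairwise SNP correlations in pure cases against a cohort in which half of the "cases" are really controls.

```python
import numpy as np


def weighted_correlation_score(genotypes, weights):
    """Sum of w_i * w_j * r_ij over all SNP pairs i < j (illustrative weighting)."""
    corr = np.corrcoef(genotypes, rowvar=False)
    pair_weights = np.outer(weights, weights)
    upper = np.triu_indices(len(weights), k=1)
    return float(np.sum(pair_weights[upper] * corr[upper]))


def simulate(n_samples=200_000, n_snps=20, h2=0.3, prevalence=0.05, seed=0):
    """Return (score in homogeneous cases, score in a 50/50 case-control mix)."""
    rng = np.random.default_rng(seed)
    maf = 0.5  # assumed allele frequency, identical for every SNP

    # Independent SNPs, standardized to mean 0 and variance 1.
    raw = rng.binomial(2, maf, size=(n_samples, n_snps))
    genotypes = (raw - 2 * maf) / np.sqrt(2 * maf * (1 - maf))

    # Equal effect sizes chosen so the PRS explains h2 of liability variance.
    betas = np.full(n_snps, np.sqrt(h2 / n_snps))
    noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=n_samples)
    liability = genotypes @ betas + noise

    # Liability threshold model: the top `prevalence` fraction are cases.
    is_case = liability > np.quantile(liability, 1.0 - prevalence)
    cases = genotypes[is_case]

    # Heterogeneous cohort: cases diluted with an equal number of controls,
    # mimicking mislabeling.
    controls = genotypes[~is_case][: len(cases)]
    mixed = np.vstack([cases, controls])

    return (
        weighted_correlation_score(cases, betas),
        weighted_correlation_score(mixed, betas),
    )


homogeneous_score, mixed_score = simulate()
print(f"homogeneous cases: {homogeneous_score:.4f}")
print(f"mixed cohort:      {mixed_score:.4f}")
```

Under these assumptions the score among homogeneous cases comes out negative, because selecting on high liability makes otherwise independent risk alleles trade off against one another, while diluting the cases with controls pushes the score upward. The effect sizes here are deliberately large so the pattern is visible at this sample size; the abstract's reported operating range (a PRS explaining about 3% of variance, 50,000-100,000 samples) is a harder setting than this toy.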