Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)

The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium can create when testing for epist...


Bibliographic Details
Main Authors: Gustavo de los Campos, Daniel Alberto Sorensen, Miguel Angel Toro
Format: Article
Language: English
Published: Oxford University Press 2019-05-01
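Series: G3: Genes, Genomes, Genetics
Subjects: epistasis; apparent epistasis; phantom epistasis; GWAS; linkage disequilibrium; imperfect LD; missing heritability; Big Data
Online Access: http://g3journal.org/lookup/doi/10.1534/g3.119.400101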
id doaj-3c46d5721229434585c32a5607b6f92e
record_format Article
spelling doaj-3c46d5721229434585c32a5607b6f92e
2021-07-02T09:41:45Z
eng
Oxford University Press
G3: Genes, Genomes, Genetics
2160-1836
2019-05-01
9
5
1429
1436
10.1534/g3.119.400101
14
Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)
Gustavo de los Campos
Daniel Alberto Sorensen
Miguel Angel Toro
The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium (LD) can create when testing for epistatic interactions between DNA markers. However, these problems have not been formalized, nor have their consequences been quantified precisely. Here we use a conceptually simple three-locus model involving a causal locus and two markers to show that imperfect LD can generate the illusion of epistasis, even when the underlying genetic architecture is purely additive. We describe necessary conditions for such “phantom epistasis” to emerge and quantify its relevance using simulations. Our empirical results demonstrate that phantom epistasis can be a very serious problem in GWAS (with rejection rates against the additive model greater than 0.28 at a nominal significance level of 0.05, even when the model is purely additive). Some studies have sought to avoid this problem by only testing interactions between SNPs with R² < 0.1. We show that this threshold is not appropriate and demonstrate that the magnitude of the problem is even greater with large sample sizes, intermediate allele frequencies, and when the causal locus explains a large amount of phenotypic variance. We conclude that caution must be exercised when interpreting GWAS results derived from very large data sets showing strong evidence in support of epistatic interactions between markers.
http://g3journal.org/lookup/doi/10.1534/g3.119.400101
epistasis
apparent epistasis
phantom epistasis
GWAS
linkage disequilibrium
imperfect LD
missing heritability
Big Data
collection DOAJ
language English
format Article
sources DOAJ
author Gustavo de los Campos
Daniel Alberto Sorensen
Miguel Angel Toro
title Imperfect Linkage Disequilibrium Generates Phantom Epistasis (& Perils of Big Data)
publisher Oxford University Press
series G3: Genes, Genomes, Genetics
issn 2160-1836
publishDate 2019-05-01
description The genetic architecture of complex human traits and diseases is affected by a large number of possibly interacting genes, but detecting epistatic interactions can be challenging. In the last decade, several studies have alluded to problems that linkage disequilibrium (LD) can create when testing for epistatic interactions between DNA markers. However, these problems have not been formalized, nor have their consequences been quantified precisely. Here we use a conceptually simple three-locus model involving a causal locus and two markers to show that imperfect LD can generate the illusion of epistasis, even when the underlying genetic architecture is purely additive. We describe necessary conditions for such “phantom epistasis” to emerge and quantify its relevance using simulations. Our empirical results demonstrate that phantom epistasis can be a very serious problem in GWAS (with rejection rates against the additive model greater than 0.28 at a nominal significance level of 0.05, even when the model is purely additive). Some studies have sought to avoid this problem by only testing interactions between SNPs with R² < 0.1. We show that this threshold is not appropriate and demonstrate that the magnitude of the problem is even greater with large sample sizes, intermediate allele frequencies, and when the causal locus explains a large amount of phenotypic variance. We conclude that caution must be exercised when interpreting GWAS results derived from very large data sets showing strong evidence in support of epistatic interactions between markers.
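The mechanism summarized in the description can be illustrated with a small simulation. The sketch below is not the authors' simulation design; it is a deliberately simple toy construction, and every name in it (`simulate_genotypes`, `interaction_t`, `rejection_rate`) is hypothetical. It assumes two unlinked markers with allele frequency 0.5 (so the marker-marker R² is about 0, well under the 0.1 threshold) and a causal allele that sits only on haplotypes carrying both marker alleles (so each marker has an R² of about 1/3 with the causal locus). The phenotype is purely additive in the causal genotype, yet the marker-by-marker interaction test should reject far more often than the nominal 5%. A normal approximation (|t| > 1.96) stands in for an exact t-test.

```python
import numpy as np

def simulate_genotypes(n, rng):
    """Return marker genotypes X1, X2 and causal genotype Z for n diploids."""
    # Two haplotypes per individual; marker alleles are independent, frequency 0.5.
    h1 = rng.integers(0, 2, size=(n, 2))
    h2 = rng.integers(0, 2, size=(n, 2))
    # Toy assumption: the causal allele occurs only on haplotypes carrying
    # both marker alleles, so it is in imperfect LD with each marker.
    z = h1 * h2
    return h1.sum(axis=1), h2.sum(axis=1), z.sum(axis=1)

def interaction_t(y, x1, x2):
    """OLS t-statistic for the x1*x2 term in y ~ 1 + x1 + x2 + x1*x2."""
    X = np.column_stack([np.ones_like(y), x1, x2, x1 * x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[3] / np.sqrt(cov[3, 3])

def rejection_rate(effect, n=2000, reps=200, seed=1):
    """Share of replicates in which the marker interaction test rejects."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x1, x2, z = simulate_genotypes(n, rng)
        # Purely additive architecture: y depends linearly on the causal genotype.
        y = effect * z + rng.standard_normal(n)
        if abs(interaction_t(y, x1, x2)) > 1.96:  # normal approximation, alpha = 0.05
            rejections += 1
    return rejections / reps

if __name__ == "__main__":
    print("additive causal effect:", rejection_rate(0.5))
    print("no causal effect:      ", rejection_rate(0.0))
```

In this construction the expected causal genotype given the markers is X1·X2/2, which is why the interaction term absorbs the additive signal; with no causal effect the same test should reject at roughly the nominal rate.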
topic epistasis
apparent epistasis
phantom epistasis
GWAS
linkage disequilibrium
imperfect LD
missing heritability
Big Data
url http://g3journal.org/lookup/doi/10.1534/g3.119.400101