Which Genetic Variants in DNase-Seq Footprints Are More Likely to Alter Binding?
Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict th...
Main Authors: | Gregory A Moyerbrailean, Cynthia A Kalita, Chris T Harvey, Xiaoquan Wen, Francesca Luca, Roger Pique-Regi |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2016-02-01 |
Series: | PLoS Genetics |
Online Access: | http://europepmc.org/articles/PMC4764260?pdf=render |
id |
doaj-3f26b1e8dbfb43d88585cf94cd32c647 |
---|---|
record_format |
Article |
spelling |
doaj-3f26b1e8dbfb43d88585cf94cd32c647 2020-11-25T00:07:26Z eng Public Library of Science (PLoS) PLoS Genetics 1553-7390 1553-7404 2016-02-01 12 2 e1005875 10.1371/journal.pgen.1005875 Which Genetic Variants in DNase-Seq Footprints Are More Likely to Alter Binding? Gregory A Moyerbrailean Cynthia A Kalita Chris T Harvey Xiaoquan Wen Francesca Luca Roger Pique-Regi Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation. http://europepmc.org/articles/PMC4764260?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Gregory A Moyerbrailean Cynthia A Kalita Chris T Harvey Xiaoquan Wen Francesca Luca Roger Pique-Regi |
author_sort |
Gregory A Moyerbrailean |
title |
Which Genetic Variants in DNase-Seq Footprints Are More Likely to Alter Binding? |
title_sort |
which genetic variants in dnase-seq footprints are more likely to alter binding? |
publisher |
Public Library of Science (PLoS) |
series |
PLoS Genetics |
issn |
1553-7390 1553-7404 |
publishDate |
2016-02-01 |
description |
Large experimental efforts are characterizing the regulatory genome, yet we are still missing a systematic definition of functional and silent genetic variants in non-coding regions. Here, we integrated DNaseI footprinting data with sequence-based transcription factor (TF) motif models to predict the impact of a genetic variant on TF binding across 153 tissues and 1,372 TF motifs. Each annotation we derived is specific for a cell-type condition or assay and is locally motif-driven. We found 5.8 million genetic variants in footprints, 66% of which are predicted by our model to affect TF binding. Comprehensive examination using allele-specific hypersensitivity (ASH) reveals that only the latter group consistently shows evidence for ASH (3,217 SNPs at 20% FDR), suggesting that most (97%) genetic variants in footprinted regulatory regions are indeed silent. Combining this information with GWAS data reveals that our annotation helps in computationally fine-mapping 86 SNPs in GWAS hit regions with at least a 2-fold increase in the posterior odds of picking the causal SNP. The rich meta information provided by the tissue-specificity and the identity of the putative TF binding site being affected also helps in identifying the underlying mechanism supporting the association. As an example, the enrichment for LDL level-associated SNPs is 9.1-fold higher among SNPs predicted to affect HNF4 binding sites than in a background model already including tissue-specific annotation. |
url |
http://europepmc.org/articles/PMC4764260?pdf=render |