Learning a prior on regulatory potential from eQTL data.

Genome-wide RNA expression data provide a detailed view of an organism's biological state; hence, a dataset measuring expression variation between genetically diverse individuals (eQTL data) may provide important insights into the genetics of complex traits. However, with data from a relatively...


Bibliographic Details
Main Authors: Su-In Lee, Aimée M Dudley, David Drubin, Pamela A Silver, Nevan J Krogan, Dana Pe'er, Daphne Koller
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2009-01-01
Series: PLoS Genetics
Online Access: http://europepmc.org/articles/PMC2627940?pdf=render
id doaj-75832323fdd44b459c176e2ccaa96fd0
record_format Article
spelling doaj-75832323fdd44b459c176e2ccaa96fd0; 2020-11-25T02:23:50Z; eng; Public Library of Science (PLoS); PLoS Genetics; 1553-7390; 1553-7404; 2009-01-01; vol. 5, no. 1, e1000358; 10.1371/journal.pgen.1000358; Learning a prior on regulatory potential from eQTL data.; Su-In Lee; Aimée M Dudley; David Drubin; Pamela A Silver; Nevan J Krogan; Dana Pe'er; Daphne Koller; http://europepmc.org/articles/PMC2627940?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Su-In Lee
Aimée M Dudley
David Drubin
Pamela A Silver
Nevan J Krogan
Dana Pe'er
Daphne Koller
title Learning a prior on regulatory potential from eQTL data.
publisher Public Library of Science (PLoS)
series PLoS Genetics
issn 1553-7390
1553-7404
publishDate 2009-01-01
description Genome-wide RNA expression data provide a detailed view of an organism's biological state; hence, a dataset measuring expression variation between genetically diverse individuals (eQTL data) may provide important insights into the genetics of complex traits. However, with data from a relatively small number of individuals, it is difficult to distinguish true causal polymorphisms from the large number of possibilities. The problem is particularly challenging in populations with significant linkage disequilibrium, where traits are often linked to large chromosomal regions containing many genes. Here, we present a novel method, Lirnet, that automatically learns a regulatory potential for each sequence polymorphism, estimating how likely it is to have a significant effect on gene expression. This regulatory potential is defined in terms of "regulatory features" (including the function of the gene and the conservation, type, and position of genetic polymorphisms) that are available for any organism. The extent to which the different features influence the regulatory potential is learned automatically, making Lirnet readily applicable to different datasets, organisms, and feature sets. We apply Lirnet both to the human HapMap eQTL dataset and to a yeast eQTL dataset and provide statistical and biological results demonstrating that Lirnet produces significantly better regulatory programs than other recent approaches. We demonstrate in the yeast data that Lirnet can correctly suggest a specific causal sequence variation within a large, linked chromosomal region. In one example, Lirnet uncovered a novel, experimentally validated connection between Puf3 (a sequence-specific RNA binding protein) and P-bodies (cytoplasmic structures that regulate translation and RNA stability), as well as the particular causative polymorphism, a SNP in Mkt1, that induces the variation in the pathway.
url http://europepmc.org/articles/PMC2627940?pdf=render