A specialized learner for inferring structured cis-regulatory modules


Bibliographic Details
Main Authors: Noto Keith, Craven Mark
Format: Article
Language: English
Published: BMC 2006-12-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/7/528
Description
Summary:
Background: The process of transcription is controlled by systems of transcription factors, which bind to specific patterns of binding sites in the transcriptional control regions of genes, called cis-regulatory modules (CRMs). We present an expressive and easily comprehensible CRM representation that captures several aspects of a CRM's structure and distinguishes DNA sequences that contain the CRM from those that do not. We also present a learning algorithm tailored for this domain, and a novel method to avoid overfitting by controlling the expressivity of the model.
Results: We are able to find statistically significant CRMs more often than a current state-of-the-art approach on the same data sets. We also show experimentally that each aspect of our expressive CRM model space makes a positive contribution to the learned models on yeast and fly data.
Conclusion: Structural aspects are an important part of CRMs, both in terms of interpreting them biologically and learning them accurately. Source code for our algorithm is available at: http://www.cs.wisc.edu/~noto/crm
ISSN: 1471-2105