Summary: | Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. === Cataloged from PDF version of thesis. === Includes bibliographical references (p. 123-124). === In this thesis, we present two methods for identifying binding events in ChIP-Seq data. The motivation is to propose a complete read-generating process, under a probabilistic graphical model framework, that determines binding event locations more accurately and enforces alignment of events across conditions. More specifically, we first propose the Spatial Coupling method, which exploits the relative positions of reads by assuming that nearby reads are assigned to events dependently. Second, we present the Temporal Coupling method, whose goal is to align events across multiple conditions, assuming that a transcription factor binds to the same genomic coordinates across conditions. We test Spatial Coupling on toy and real data, comparing it with a Simple Mixture model, which assumes independence between reads' positions and their assignments. We show that the Simple Mixture model is generally superior to the proposed method, both in locating events accurately and in running time. In addition, we apply Temporal Coupling to synthetic and real data and show that, unlike the Simple Mixture model, it achieves alignment across conditions. Furthermore, we show using synthetic data that even if the binding events are not aligned or not present in all conditions, the algorithm retains its alignment property and avoids calling false positive peaks in places where they do not actually exist. Lastly, we demonstrate that when binding events are aligned, Temporal Coupling achieves better spatial resolution than the Simple Mixture model, as well as better sensitivity and specificity. === by Georgios Papachristoudis. === S.M.
|