Computational analysis of transcriptional regulation from local sequence features to three dimensional chromatin domains

Bibliographic Details
Main Author: Chen, Julie Chih-yu
Language: English
Published: University of British Columbia 2016
Online Access: http://hdl.handle.net/2429/59384
Description
Summary: Regulation of gene expression spans different levels of complexity: from genomic sequence, transcription factor binding and epigenetics, to three-dimensional chromatin interactions. Data from different individuals, such as genetic variation, present an extra dimension to consider. Abnormal activities at any level may lead to disease phenotypes, motivating deeper exploration of gene regulation. New high-throughput sequencing techniques have empowered genome-wide studies of the regulatory mechanisms within cells. This thesis uses computational approaches to examine gene regulation with high-throughput data in order to address biological hypotheses ranging from short local sequence features to megabase-sized topologically associating domains (TADs). The hypotheses addressed in the thesis have two central themes: 1) the elucidation of local and domain regulation of gene expression, and 2) the application of such knowledge to identify functional phenotypic variants. We developed a computational approach to identify functional variants associated with cancer, and demonstrated how annotating regulatory sequences and linking these regions to target genes can strengthen genome interpretation. The concurrent and intertwined nature of local and domain regulation of gene expression emerges as the thesis unfolds. In a study of genes that escape from X-chromosome inactivation, we found the YY1 transcription factor to be a key regulator that is potentially associated with long-distance chromatin looping mechanisms. Similarly, when studying the spread of inactivation to the autosomes in translocated cells, we detected local features associated with inactivation status, and at the domain level, we observed the spreading to be in accordance with TADs. Lastly, when considering TADs as transcriptional units, the identification of cell type-selectively co-expressed and co-localized TADs highlighted an organized and dynamic chromatin architecture across multiple cell types.
In summary, this thesis provides insights into the mechanisms involved in gene expression across multiple scales (from local sequences to chromatin domains) using computational analyses of publicly available datasets. The presented methods and results have potential applications in interpreting genetic variation and furthering our understanding of diseases and phenotypes. The findings may contribute to a coming era of preventative and regenerative medicine.
Faculty of Science, Graduate