Deciphering hierarchical organization of topologically associated domains through change-point testing

Bibliographic Details
Main Authors: Chen, Y. (Author), Wu, Y. (Author), Xing, H. (Author), Zhang, M.Q. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: https://doi.org/10.1186/s12859-021-04113-8
LEADER 03498nam a2200613Ia 4500
001 10.1186-s12859-021-04113-8
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Deciphering hierarchical organization of topologically associated domains through change-point testing 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04113-8 
520 3 |a Background: The nucleus of eukaryotic cells spatially packages chromosomes into a hierarchical and distinct segregation that plays critical roles in maintaining transcription regulation. High-throughput methods of chromosome conformation capture, such as Hi-C, have revealed topologically associating domains (TADs) that are defined by biased chromatin interactions within them. Results: We introduce a novel method, HiCKey, to decipher hierarchical TAD structures in Hi-C data and compare them across samples. We first derive a generalized likelihood-ratio (GLR) test for detecting change-points in an interaction matrix that follows a negative binomial distribution or general mixture distribution. We then employ several optimal search strategies to decipher hierarchical TADs with p values calculated by the GLR test. Large-scale validations of simulation data show that HiCKey has good precision in recalling known TADs and is robust against random collisions of chromatin interactions. By applying HiCKey to Hi-C data of seven human cell lines, we identified multiple layers of TAD organization among them, but the vast majority had no more than four layers. In particular, we found that TAD boundaries are significantly enriched in active chromosomal regions compared to repressed regions. Conclusions: HiCKey is optimized for processing large matrices constructed from high-resolution Hi-C experiments. The method and theoretical result of the GLR test provide a general framework for significance testing of similar experimental chromatin interaction data that may not fully follow negative binomial distributions but rather more general mixture distributions. © 2021, The Author(s). 
650 0 4 |a article 
650 0 4 |a binomial distribution 
650 0 4 |a Cell culture 
650 0 4 |a cell nucleus 
650 0 4 |a Cell Nucleus 
650 0 4 |a Change-points 
650 0 4 |a chromatin 
650 0 4 |a Chromatin 
650 0 4 |a Chromatin interaction 
650 0 4 |a chromosome 
650 0 4 |a Chromosomes 
650 0 4 |a computer simulation 
650 0 4 |a Computer Simulation 
650 0 4 |a controlled study 
650 0 4 |a gene expression regulation 
650 0 4 |a Gene Expression Regulation 
650 0 4 |a Generalized Likelihood Ratio Test 
650 0 4 |a Generalized likelihood-ratio test 
650 0 4 |a genetics 
650 0 4 |a Hi-C data 
650 0 4 |a Hierarchical organizations 
650 0 4 |a Hierarchical TADs 
650 0 4 |a High-throughput method 
650 0 4 |a human 
650 0 4 |a human cell 
650 0 4 |a Humans 
650 0 4 |a Matrix algebra 
650 0 4 |a Mixture distributions 
650 0 4 |a Mixtures 
650 0 4 |a Negative binomial distribution 
650 0 4 |a Optimal search strategy 
650 0 4 |a Significance testing 
650 0 4 |a simulation 
650 0 4 |a Topology 
650 0 4 |a Transcription 
650 0 4 |a Transcription regulations 
700 1 |a Chen, Y.  |e author 
700 1 |a Wu, Y.  |e author 
700 1 |a Xing, H.  |e author 
700 1 |a Zhang, M.Q.  |e author 
773 |t BMC Bioinformatics