TADKB: Family classification and a knowledge base of topologically associating domains

Abstract Background: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, the literature lacks an integrated resource where researchers can obtain family classifications and detailed information about TADs. Result...

Bibliographic Details
Main Authors: Tong Liu, Jacob Porter, Chenguang Zhao, Hao Zhu, Nan Wang, Zheng Sun, Yin-Yuan Mo, Zheng Wang
Format: Article
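Language: English
Published: BMC 2019-03-01
Series: BMC Genomics
Subjects: Topologically associating domains; TADs; Family classification; Single-cell 3D genome structures; Long non-coding RNAs; lncRNAs
Online Access: http://link.springer.com/article/10.1186/s12864-019-5551-2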
id doaj-c0a97680995f440eaada8e67da323c07
record_format Article
spelling doaj-c0a97680995f440eaada8e67da323c07
2020-11-25T00:41:00Z
eng
BMC
BMC Genomics
1471-2164
2019-03-01
Volume 20, Issue 1, Pages 1-17
10.1186/s12864-019-5551-2
Tong Liu (0): Department of Computer Science, University of Miami
Jacob Porter (1): School of Computing Sciences and Computer Engineering, University of Southern Mississippi
Chenguang Zhao (2): School of Computing Sciences and Computer Engineering, University of Southern Mississippi
Hao Zhu (3): School of Computing Sciences and Computer Engineering, University of Southern Mississippi
Nan Wang (4): Department of Computer Science, New Jersey City University
Zheng Sun (5): Department of Electrical and Computer Engineering, California Baptist University
Yin-Yuan Mo (6): Department of Pharmacology and Toxicology, University of Mississippi Medical Center
Zheng Wang (7): Department of Computer Science, University of Miami
collection DOAJ
language English
format Article
sources DOAJ
author Tong Liu
Jacob Porter
Chenguang Zhao
Hao Zhu
Nan Wang
Zheng Sun
Yin-Yuan Mo
Zheng Wang
title TADKB: Family classification and a knowledge base of topologically associating domains
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2019-03-01
description Abstract Background: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, the literature lacks an integrated resource where researchers can obtain family classifications and detailed information about TADs. Results: We built TADKB, an online knowledge base that integrates knowledge about TADs in eleven human and mouse cell types. For each TAD, TADKB provides the predicted three-dimensional (3D) structures of chromosomes and TADs, along with detailed annotations of the protein-coding genes and long non-coding RNAs (lncRNAs) located in that TAD. In addition to the 3D chromosomal structures inferred from population Hi-C, TADKB integrates the single-cell, haplotype-resolved 3D chromosomal structures of 17 GM12878 cells. Users can submit a gene or lncRNA ID or sequence to search for the TAD(s) containing it. We also classified TADs into families. To do so, we used the TM-scores between reconstructed 3D structures of TADs as structural similarities and the Pearson correlation coefficients between the fold enrichments of chromatin states as functional similarities. All TADs in a given cell type were clustered separately by structural and by functional similarity, using the spectral clustering algorithm with various predefined numbers of clusters. Comparing the TADs shared between structural and functional clusters, we found that most TADs in the functional clusters with depleted chromatin states fall into one or two structural clusters. This novel finding indicates a connection between the 3D structures of TADs and their DNA functions in terms of chromatin states. Conclusion: TADKB is available at http://dna.cs.miami.edu/TADKB/.
topic Topologically associating domains
TADs
Family classification
Single-cell 3D genome structures
Long non-coding RNAs
lncRNAs
url http://link.springer.com/article/10.1186/s12864-019-5551-2
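
The abstract describes functional similarity as the Pearson correlation between the chromatin-state fold enrichments of two TADs, and then compares which TADs the structural and functional clusters share. The Python sketch below illustrates those two steps only, under stated assumptions: the function names, TAD identifiers, enrichment values, and cluster labels are all hypothetical and are not taken from TADKB. The TM-score computation and the spectral clustering step are not reproduced here; the cluster labels are supplied by hand.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def functional_similarity(profiles):
    """Pairwise Pearson matrix for a dict {tad_id: fold-enrichment vector}.

    Returns the sorted TAD ids and a square matrix in the same order.
    """
    ids = sorted(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in ids] for a in ids]
    return ids, matrix


def cluster_overlap(structural, functional):
    """Count TADs for each (functional cluster, structural cluster) pair.

    Only TADs that have a label in both assignments are counted.
    """
    shared = structural.keys() & functional.keys()
    return Counter((functional[t], structural[t]) for t in shared)


# Hypothetical fold enrichments over four made-up chromatin states.
profiles = {
    "tad_a": [2.0, 1.5, 0.2, 0.1],
    "tad_b": [1.8, 1.6, 0.3, 0.2],
    "tad_c": [0.1, 0.3, 1.9, 2.2],
}
ids, sim = functional_similarity(profiles)

# Hypothetical cluster labels, standing in for spectral clustering output.
structural_labels = {"tad_a": 0, "tad_b": 0, "tad_c": 1}
functional_labels = {"tad_a": "F1", "tad_b": "F1", "tad_c": "F2"}
overlap = cluster_overlap(structural_labels, functional_labels)
```

In this toy input, `sim` could serve as a precomputed affinity for a clustering routine (after shifting negative correlations to non-negative values, if the routine requires that), and `overlap` shows how many TADs from each functional cluster land in each structural cluster.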