Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture
The study of transcriptional regulation remains difficult yet fundamental to molecular biology research. Recent research has shown that the double-helix structure of DNA plays an important role in improving the accuracy and interpretability of transcription factor binding site (TFBS) prediction. Alt...
Main Authors: | Siguo Wang, Qinhu Zhang, Zhen Shen, Ying He, Zhen-Heng Chen, Jianqiang Li, De-Shuang Huang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2021-06-01 |
Series: | Molecular Therapy: Nucleic Acids |
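Subjects: | transcription factor binding sites; DNA sequence; DNA shape features; hybrid convolutional neural network; recurrent neural network |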
Online Access: | http://www.sciencedirect.com/science/article/pii/S2162253121000494 |
id |
doaj-5613196c730b47b7815fd2001efb451c |
---|---|
record_format |
Article |
spelling |
doaj-5613196c730b47b7815fd2001efb451c; 2021-06-05T06:08:13Z; eng; Elsevier; Molecular Therapy: Nucleic Acids; ISSN 2162-2531; 2021-06-01; volume 24, pages 154-163
Authors: Siguo Wang (0), Qinhu Zhang (1), Zhen Shen (2), Ying He (3), Zhen-Heng Chen (4), Jianqiang Li (5), De-Shuang Huang (6)
Affiliations:
(0) The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China
(1) The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China; Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Tongji University, Siping Road 1239, Shanghai 200092, China; Corresponding author: Qinhu Zhang, The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China.
(2) School of Computer and Software, Nanyang Institute of Technology, Changjiang Road 80, Nanyang, Henan 473004, China
(3) The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China
(4) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
(5) College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
(6) The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China; Corresponding author: De-Shuang Huang, The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China.
http://www.sciencedirect.com/science/article/pii/S2162253121000494
Keywords: transcription factor binding sites; DNA sequence; DNA shape features; hybrid convolutional neural network; recurrent neural network |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Siguo Wang Qinhu Zhang Zhen Shen Ying He Zhen-Heng Chen Jianqiang Li De-Shuang Huang |
title |
Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture |
publisher |
Elsevier |
series |
Molecular Therapy: Nucleic Acids |
issn |
2162-2531 |
publishDate |
2021-06-01 |
description |
The study of transcriptional regulation remains difficult yet fundamental to molecular biology research. Recent research has shown that the double-helix structure of DNA plays an important role in improving the accuracy and interpretability of transcription factor binding site (TFBS) prediction. Although several computational methods have been designed to consider DNA sequence and DNA shape features simultaneously, designing an efficient model remains an open problem. In this paper, we propose a hybrid convolutional recurrent neural network (CNN/RNN) architecture, CRPTS, that predicts TFBSs by combining DNA sequence and DNA shape features. The novelty of the method rests on three aspects: (1) a shared hybrid CNN and RNN efficiently extracts features from large-scale genomic sequences obtained by high-throughput technology; (2) common patterns are found in DNA sequences and their corresponding DNA shape features; (3) CRPTS can capture local structural information of DNA sequences without relying completely on DNA shape data. A series of comprehensive experiments on 66 in vitro datasets derived from universal protein binding microarrays (uPBMs) shows that CRPTS clearly outperforms state-of-the-art methods. |
topic |
transcription factor binding sites DNA sequence DNA shape features hybrid convolutional neural network recurrent neural network |
url |
http://www.sciencedirect.com/science/article/pii/S2162253121000494 |
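
The abstract of this record describes combining DNA sequence and DNA shape features as model input. As a rough illustration of what that can look like at the input level, the Python sketch below one-hot encodes a DNA string and appends per-position shape values to each row. It is not the CRPTS implementation: the function names `one_hot` and `combine_features`, the track names `shape_a` and `shape_b`, and all numeric values are hypothetical, and stacking both feature types into a single matrix is only one possible design, assumed here for simplicity.

```python
from typing import Dict, List

NUCLEOTIDES = "ACGT"  # column order of the one-hot encoding


def one_hot(sequence: str) -> List[List[float]]:
    """Return one row of four 0/1 values per base, in A, C, G, T order.

    Symbols outside ACGT (for example N) become an all-zero row.
    """
    return [
        [1.0 if base == nucleotide else 0.0 for nucleotide in NUCLEOTIDES]
        for base in sequence.upper()
    ]


def combine_features(
    sequence: str, shape_tracks: Dict[str, List[float]]
) -> List[List[float]]:
    """Append per-position shape values to the one-hot rows.

    shape_tracks maps a (hypothetical) shape feature name to one value
    per base. Tracks are appended in sorted name order so that the
    column layout is deterministic.
    """
    rows = one_hot(sequence)
    for name in sorted(shape_tracks):
        track = shape_tracks[name]
        if len(track) != len(sequence):
            raise ValueError(
                f"track {name!r} has {len(track)} values, "
                f"expected {len(sequence)}"
            )
        for row, value in zip(rows, track):
            row.append(value)
    return rows


if __name__ == "__main__":
    # Made-up shape values, used only to show the resulting layout.
    features = combine_features(
        "ACGN",
        {"shape_a": [0.1, 0.2, 0.3, 0.4], "shape_b": [1.0, 2.0, 3.0, 4.0]},
    )
    for row in features:
        print(row)
```

Each output row holds four sequence columns followed by one column per shape track. A matrix of this form could then be passed to a convolutional or recurrent layer in a framework of choice.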