SeqEnhDL: sequence-based classification of cell type-specific enhancers using deep learning models

Abstract. Objective: To address the challenge of computational identification of cell type-specific regulatory elements on a genome-wide scale. Results: We propose SeqEnhDL, a deep learning framework for classifying cell type-specific enhancers based on sequence features. DNA sequences of “strong enhan...


Bibliographic Details
Main Authors: Yupeng Wang, Rosario B. Jaime-Lara, Abhrarup Roy, Ying Sun, Xinyue Liu, Paule V. Joseph
Format: Article
Language: English
Published: BMC 2021-03-01
Series: BMC Research Notes
Subjects: Enhancer; Classification; Deep learning; DNA sequence; Cell type
Online Access: https://doi.org/10.1186/s13104-021-05518-7
id doaj-7ee64100ada744e79c8b1136b2177f31
record_format Article
spelling doaj-7ee64100ada744e79c8b1136b2177f31; 2021-03-21T12:44:15Z; eng; BMC; BMC Research Notes; 1756-0500; 2021-03-01; 14; 1; 1; 7; 10.1186/s13104-021-05518-7
SeqEnhDL: sequence-based classification of cell type-specific enhancers using deep learning models
Yupeng Wang (0); Rosario B. Jaime-Lara (1); Abhrarup Roy (2); Ying Sun (3); Xinyue Liu (4); Paule V. Joseph (5)
(0) BDX Research and Consulting LLC
(1) Division of Intramural Clinical and Biological Research (DICBR), National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health
(2) Division of Intramural Research, National Institute of Nursing Research, National Institutes of Health
(3) BDX Research and Consulting LLC
(4) BDX Research and Consulting LLC
(5) Division of Intramural Clinical and Biological Research (DICBR), National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health
https://doi.org/10.1186/s13104-021-05518-7
Enhancer; Classification; Deep learning; DNA sequence; Cell type
collection DOAJ
language English
format Article
sources DOAJ
author Yupeng Wang
Rosario B. Jaime-Lara
Abhrarup Roy
Ying Sun
Xinyue Liu
Paule V. Joseph
title SeqEnhDL: sequence-based classification of cell type-specific enhancers using deep learning models
publisher BMC
series BMC Research Notes
issn 1756-0500
publishDate 2021-03-01
description Abstract. Objective: To address the challenge of computational identification of cell type-specific regulatory elements on a genome-wide scale. Results: We propose SeqEnhDL, a deep learning framework for classifying cell type-specific enhancers based on sequence features. DNA sequences of “strong enhancer” chromatin states in nine cell types from the ENCODE project were retrieved to build and test enhancer classifiers. For any DNA sequence, positional k-mer (k = 5, 7, 9 and 11) fold changes relative to randomly selected non-coding sequences at each nucleotide position were used as features for the deep learning models. Three deep learning models were implemented: a multi-layer perceptron (MLP), a convolutional neural network (CNN) and a recurrent neural network (RNN). All models in SeqEnhDL outperform state-of-the-art enhancer classifiers (including gkm-SVM and DanQ) in distinguishing cell type-specific enhancers from randomly selected non-coding sequences. Moreover, SeqEnhDL can directly discriminate between enhancers from different cell types, which has not been achieved by other enhancer classifiers. Our analysis suggests that both enhancers and their tissue specificity can be accurately identified from their sequence features. SeqEnhDL is publicly available on GitHub (https://github.com/wyp1125/SeqEnhDL).
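The positional k-mer fold-change features mentioned in the description can be illustrated with a minimal sketch. This is not taken from the SeqEnhDL repository: the function names, the add-one pseudocount, the plain (non-log) ratio, and the toy sequences with k = 3 are all assumptions made for illustration; the article itself uses k = 5, 7, 9 and 11, and the published code remains the authoritative implementation.

```python
from collections import Counter

def positional_kmer_counts(seqs, k):
    """Count each k-mer at each start position across equal-length sequences."""
    n_positions = len(seqs[0]) - k + 1
    counts = [Counter() for _ in range(n_positions)]
    for s in seqs:
        for p in range(n_positions):
            counts[p][s[p:p + k]] += 1
    return counts

def fold_change_features(seq, fg_counts, bg_counts, n_fg, n_bg, k, pseudo=1.0):
    """One feature per position: smoothed foreground/background k-mer frequency ratio.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers never seen at a position in the background set.
    """
    feats = []
    for p in range(len(seq) - k + 1):
        kmer = seq[p:p + k]
        fg = (fg_counts[p][kmer] + pseudo) / (n_fg + pseudo)
        bg = (bg_counts[p][kmer] + pseudo) / (n_bg + pseudo)
        feats.append(fg / bg)
    return feats

# Toy data (hypothetical): "enhancer" sequences vs. random background sequences.
enhancers = ["ACGTAC", "ACGTTC", "ACGAAC"]
background = ["TTGTAC", "GCGTTA", "ACCAAC"]
k = 3

fg = positional_kmer_counts(enhancers, k)
bg = positional_kmer_counts(background, k)
features = fold_change_features("ACGTAC", fg, bg, len(enhancers), len(background), k)
print(features)  # [4.0, 1.5, 1.0, 1.0]
```

A vector like `features`, computed for each value of k, is the kind of per-position input that could then be passed to an MLP, CNN or RNN classifier.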
topic Enhancer
Classification
Deep learning
DNA sequence
Cell type
url https://doi.org/10.1186/s13104-021-05518-7