RPITER: A Hierarchical Deep Learning Framework for ncRNA–Protein Interaction Prediction

Non-coding RNAs (ncRNAs) play crucial roles in multiple fundamental biological processes, such as post-transcriptional gene regulation, and are implicated in many complex human diseases. Most ncRNAs function by interacting with corresponding RNA-binding proteins. The research on ncRNA–pr...

Bibliographic Details
Main Authors: Cheng Peng, Siyu Han, Hui Zhang, Ying Li
Format: Article
Language: English
Published: MDPI AG 2019-03-01
Series: International Journal of Molecular Sciences
Subjects:
CNN
Online Access: http://www.mdpi.com/1422-0067/20/5/1070
id doaj-2351a308006c4cfea5e851f81fed1f48
record_format Article
spelling doaj-2351a308006c4cfea5e851f81fed1f48
2020-11-24T22:00:40Z
eng
MDPI AG
International Journal of Molecular Sciences
1422-0067
2019-03-01
20
5
1070
10.3390/ijms20051070
ijms20051070
RPITER: A Hierarchical Deep Learning Framework for ncRNA–Protein Interaction Prediction
Cheng Peng (0), Siyu Han (1), Hui Zhang (2), Ying Li (3)
College of Computer Science and Technology, Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China (affiliation of all four authors)
http://www.mdpi.com/1422-0067/20/5/1070
ncRNA–protein interaction prediction
ncRNA
deep learning
CNN
collection DOAJ
language English
format Article
sources DOAJ
author Cheng Peng
Siyu Han
Hui Zhang
Ying Li
title RPITER: A Hierarchical Deep Learning Framework for ncRNA–Protein Interaction Prediction
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1422-0067
publishDate 2019-03-01
description Non-coding RNAs (ncRNAs) play crucial roles in multiple fundamental biological processes, such as post-transcriptional gene regulation, and are implicated in many complex human diseases. Most ncRNAs function by interacting with corresponding RNA-binding proteins, so research on ncRNA–protein interaction is key to understanding ncRNA function. However, experimental techniques for identifying RNA–protein interactions (RPIs) are still expensive and time-consuming. Because of the complex molecular mechanism of ncRNA–protein interaction and the lack of conservation among ncRNAs, especially long ncRNAs (lncRNAs), predicting ncRNA–protein interaction remains a challenge. Deep learning-based models have become the state of the art in a range of biological sequence analysis problems owing to their strong feature-learning ability. In this study, we proposed RPITER, a hierarchical deep learning framework for predicting RNA–protein interaction. For sequence coding, we improved the conjoint triad feature (CTF) coding method by adding more primary sequence information and sequence structure information. For model design, RPITER employed two basic neural network architectures: the convolutional neural network (CNN) and the stacked auto-encoder (SAE). Comprehensive experiments were performed on five benchmark datasets from the PDB and NPInter databases to analyze and compare the performance of different sequence coding methods and prediction models. We found that the CNN and SAE architectures have powerful fitting abilities for the k-mer features of RNA and protein sequences. The improved CTF coding method showed a performance gain over the original CTF method. Moreover, RPITER performed well in predicting RPIs and could outperform most of the previous methods. On five widely used RPI datasets, RPI369, RPI488, RPI1807, RPI2241 and NPInter, RPITER obtained AUC values of 0.821, 0.911, 0.990, 0.957 and 0.985, respectively. RPITER could serve as a complementary method for predicting RPIs and constructing RPI networks, which would help advance biological research on ncRNAs and lncRNAs.
topic ncRNA–protein interaction prediction
ncRNA
deep learning
CNN
url http://www.mdpi.com/1422-0067/20/5/1070
_version_ 1725843442336006144
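
The description above refers to k-mer features and conjoint triad feature (CTF) coding of RNA and protein sequences. The following is a minimal sketch of that general idea, not the authors' implementation: the function names are hypothetical, the seven-class amino acid grouping is the one commonly used for conjoint triads (the record does not list it), and plain frequency normalization is a simplifying assumption. The improved CTF coding described in the abstract adds further sequence and structure information that is not reproduced here.

```python
from itertools import product

# Seven-class amino acid grouping commonly used for conjoint triads.
# Assumption: the record itself does not state which classes were used.
AA_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: str(i) for i, group in enumerate(AA_GROUPS) for aa in group}


def kmer_frequencies(seq, alphabet, k):
    """Return normalized k-mer frequencies of seq, in a fixed order over alphabet."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows containing unknown symbols are skipped
            counts[window] += 1
            total += 1
    # Plain frequency normalization (an assumption made for this sketch).
    return [counts[m] / total if total else 0.0 for m in kmers]


def encode_rna(seq, k=3):
    """Hypothetical helper: k-mer frequency vector over A, C, G, U (4**k values)."""
    return kmer_frequencies(seq.upper().replace("T", "U"), "ACGU", k)


def encode_protein(seq, k=3):
    """Hypothetical helper: map residues to 7 classes, then count k-mers (7**k values)."""
    reduced = "".join(AA_TO_CLASS.get(aa, "X") for aa in seq.upper())
    return kmer_frequencies(reduced, "0123456", k)
```

With k=3, `encode_rna` yields a 64-dimensional vector and `encode_protein` a 343-dimensional one; vectors of this kind are the sort of fixed-length input that a CNN or SAE model could consume.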