IIMLP: integrated information-entropy-based method for LncRNA prediction
Abstract Background The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as growing evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of bio-med big data, in addition to the prediction of lncRNAs by biolog...
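Main Authors: | Junyi Li, Huinian Li, Xiao Ye, Li Zhang, Qingzhe Xu, Yuan Ping, Xiaozhu Jing, Wei Jiang, Qing Liao, Bo Liu, Yadong Wang |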
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2021-05-01 |
Series: | BMC Bioinformatics |
Subjects: | Long non-coding RNA; Information entropy; Generalized topological entropy; Machine learning |
Online Access: | https://doi.org/10.1186/s12859-020-03884-w |
id | doaj-edb4331fd50e4bb3a04827e65f243c33 |
---|---|
record_format | Article |
spelling | doaj-edb4331fd50e4bb3a04827e65f243c33; 2021-05-16T11:36:17Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2021-05-01; 22; S3; 1; 12; 10.1186/s12859-020-03884-w; IIMLP: integrated information-entropy-based method for LncRNA prediction; Junyi Li (0), Huinian Li (1), Xiao Ye (2), Li Zhang (3), Qingzhe Xu (4), Yuan Ping (5), Xiaozhu Jing (6), Wei Jiang (7), Qing Liao (8), Bo Liu (9), Yadong Wang (10); affiliations 0-8 and 10: School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen); affiliation 9: Center for Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology; https://doi.org/10.1186/s12859-020-03884-w; Long non-coding RNA; Information entropy; Generalized topological entropy; Machine learning |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Junyi Li, Huinian Li, Xiao Ye, Li Zhang, Qingzhe Xu, Yuan Ping, Xiaozhu Jing, Wei Jiang, Qing Liao, Bo Liu, Yadong Wang |
author_facet | Junyi Li, Huinian Li, Xiao Ye, Li Zhang, Qingzhe Xu, Yuan Ping, Xiaozhu Jing, Wei Jiang, Qing Liao, Bo Liu, Yadong Wang |
author_sort | Junyi Li |
title | IIMLP: integrated information-entropy-based method for LncRNA prediction |
title_short | IIMLP: integrated information-entropy-based method for LncRNA prediction |
title_full | IIMLP: integrated information-entropy-based method for LncRNA prediction |
title_fullStr | IIMLP: integrated information-entropy-based method for LncRNA prediction |
title_full_unstemmed | IIMLP: integrated information-entropy-based method for LncRNA prediction |
title_sort | iimlp: integrated information-entropy-based method for lncrna prediction |
publisher | BMC |
series | BMC Bioinformatics |
issn | 1471-2105 |
publishDate | 2021-05-01 |
description | Abstract. Background: The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as growing evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of bio-med big data, in addition to the prediction of lncRNAs by biological experimental methods, many computational methods based on machine learning have been proposed to make better use of the sequence resources of lncRNAs. Results: We developed a lncRNA prediction method that integrates information-entropy-based features with machine learning algorithms. We calculate generalized topological entropy and generate 6 novel features for lncRNA sequences. Using these 6 features together with other features such as the open reading frame, we apply support vector machine, XGBoost and random forest algorithms to distinguish human lncRNAs. We compare our method with one that uses more k-mer features; the results show that our method achieves a higher area under the curve, up to 99.7905%. Conclusions: We develop an accurate and efficient method that uses novel information-entropy features to analyze and classify lncRNAs. Our method can also be extended to research on other functional elements in DNA sequences. |
topic | Long non-coding RNA; Information entropy; Generalized topological entropy; Machine learning |
url | https://doi.org/10.1186/s12859-020-03884-w |
_version_ | 1721439422771101696 |
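
The description above mentions entropy-based sequence features combined with an open reading frame feature. The record does not define the authors' generalized topological entropy or their 6 features, so the following is only a minimal sketch under stated assumptions: it substitutes plain Shannon entropy over overlapping k-mers for the paper's entropy measure, and the function names `kmer_shannon_entropy`, `longest_orf_length` and `feature_vector` are hypothetical, not taken from the article.

```python
import math
from collections import Counter


def kmer_shannon_entropy(seq: str, k: int = 3) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution.

    Assumption: a simple stand-in, NOT the generalized topological
    entropy used in the article.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def longest_orf_length(seq: str) -> int:
    """Length in nucleotides (stop codon included) of the longest
    ATG...stop open reading frame on the forward strand."""
    seq = seq.upper().replace("U", "T")
    stops = {"TAA", "TAG", "TGA"}
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in stops:
                best = max(best, i + 3 - start)
                start = None  # close the ORF and look for the next one
    return best


def feature_vector(seq: str, ks=(1, 2, 3)) -> list:
    """Hypothetical feature vector: one entropy per k, plus ORF length."""
    return [kmer_shannon_entropy(seq, k) for k in ks] + [float(longest_orf_length(seq))]
```

Vectors like these could then be passed to any of the classifier families the description names (support vector machine, XGBoost, random forest); the choice of `ks` here is arbitrary and purely illustrative.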