A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. In order to achieve successfu...
Main Authors: | Demirci Müşerref Duygu Saçar, Toprak Mustafa, Allmer Jens |
---|---|
Format: | Article |
Language: | English |
Published: | De Gruyter, 2016-12-01 |
Series: | Journal of Integrative Bioinformatics |
Online Access: | https://doi.org/10.1515/jib-2016-303 |
id |
doaj-e714a39c489948aaa493c539b73f1e13 |
---|---|
record_format |
Article |
spelling |
doaj-e714a39c489948aaa493c539b73f1e13; 2021-09-06T19:40:32Z; eng; De Gruyter; Journal of Integrative Bioinformatics; 1613-4516; 2016-12-01; 135310; 10.1515/jib-2016-303; jib-2016-303; A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes; Demirci Müşerref Duygu Saçar (0), Toprak Mustafa (1), Allmer Jens (2); (0) Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey; (1) Computer Engineering, Izmir Institute of Technology, Urla, Izmir, Turkey; (2) Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey; https://doi.org/10.1515/jib-2016-303 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Demirci Müşerref Duygu Saçar; Toprak Mustafa; Allmer Jens |
title |
A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes |
publisher |
De Gruyter |
series |
Journal of Integrative Bioinformatics |
issn |
1613-4516 |
publishDate |
2016-12-01 |
description |
Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. To achieve successful classification, many parameters need to be considered, such as data quality, choice of classifier settings, and feature selection. For the latter, we developed a distributed genetic algorithm on HTCondor to perform feature selection. Moreover, we employed two widely used classification algorithms, libSVM and random forest, with different settings to analyze their influence on the overall classification performance. In this study we analyzed 5 human retro virus genomes: Human endogenous retrovirus K113, Hepatitis B virus (strain ayw), Human T lymphotropic virus 1, Human T lymphotropic virus 2, Human immunodeficiency virus 2, and Human immunodeficiency virus 1. We then predicted pre-miRNAs by using the information from known virus and human pre-miRNAs. Our results indicate that these viruses produce novel, previously unknown miRNA precursors, which warrant further experimental validation. |
url |
https://doi.org/10.1515/jib-2016-303 |