A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes

Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. In order to achieve successfu...


Bibliographic Details
Main Authors: Demirci Müşerref Duygu Saçar, Toprak Mustafa, Allmer Jens
Format: Article
Language: English
Published: De Gruyter 2016-12-01
Series: Journal of Integrative Bioinformatics
Online Access: https://doi.org/10.1515/jib-2016-303
id doaj-e714a39c489948aaa493c539b73f1e13
record_format Article
spelling doaj-e714a39c489948aaa493c539b73f1e13
2021-09-06T19:40:32Z
eng
De Gruyter
Journal of Integrative Bioinformatics
1613-4516
2016-12-01
135310
10.1515/jib-2016-303
jib-2016-303
A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
Demirci Müşerref Duygu Saçar 0
Toprak Mustafa 1
Allmer Jens 2
Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey
Computer Engineering, Izmir Institute of Technology, Urla, Izmir, Turkey
Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey
Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. To achieve successful classification, many parameters need to be considered, such as data quality, choice of classifier settings, and feature selection. For the latter, we developed a distributed genetic algorithm on HTCondor to perform feature selection. Moreover, we employed two widely used classification algorithms, libSVM and random forest, with different settings to analyze their influence on the overall classification performance. In this study we analyzed 5 human retro virus genomes: Human endogenous retrovirus K113, Hepatitis B virus (strain ayw), Human T lymphotropic virus 1, Human T lymphotropic virus 2, Human immunodeficiency virus 2, and Human immunodeficiency virus 1. We then predicted pre-miRNAs by using the information from known virus and human pre-miRNAs. Our results indicate that these viruses produce novel, previously unknown miRNA precursors, which warrant further experimental validation.
https://doi.org/10.1515/jib-2016-303
collection DOAJ
language English
format Article
sources DOAJ
author Demirci Müşerref Duygu Saçar
Toprak Mustafa
Allmer Jens
spellingShingle Demirci Müşerref Duygu Saçar
Toprak Mustafa
Allmer Jens
A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
Journal of Integrative Bioinformatics
author_facet Demirci Müşerref Duygu Saçar
Toprak Mustafa
Allmer Jens
author_sort Demirci Müşerref Duygu Saçar
title A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
title_short A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
title_full A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
title_fullStr A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
title_full_unstemmed A Machine Learning Approach for MicroRNA Precursor Prediction in Retro-transcribing Virus Genomes
title_sort machine learning approach for microrna precursor prediction in retro-transcribing virus genomes
publisher De Gruyter
series Journal of Integrative Bioinformatics
issn 1613-4516
publishDate 2016-12-01
description Identification of microRNA (miRNA) precursors has seen increased efforts in recent years. The difficulty of detecting pre-miRNAs experimentally has increased the use of computational approaches. Most of these approaches rely on machine learning, especially classification. To achieve successful classification, many parameters need to be considered, such as data quality, choice of classifier settings, and feature selection. For the latter, we developed a distributed genetic algorithm on HTCondor to perform feature selection. Moreover, we employed two widely used classification algorithms, libSVM and random forest, with different settings to analyze their influence on the overall classification performance. In this study we analyzed 5 human retro virus genomes: Human endogenous retrovirus K113, Hepatitis B virus (strain ayw), Human T lymphotropic virus 1, Human T lymphotropic virus 2, Human immunodeficiency virus 2, and Human immunodeficiency virus 1. We then predicted pre-miRNAs by using the information from known virus and human pre-miRNAs. Our results indicate that these viruses produce novel, previously unknown miRNA precursors, which warrant further experimental validation.
url https://doi.org/10.1515/jib-2016-303
work_keys_str_mv AT demircimuserrefduygusacar amachinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
AT toprakmustafa amachinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
AT allmerjens amachinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
AT demircimuserrefduygusacar machinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
AT toprakmustafa machinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
AT allmerjens machinelearningapproachformicrornaprecursorpredictioninretrotranscribingvirusgenomes
_version_ 1717768300156420096
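
Note on the method named in the description: the record says feature selection was performed with a distributed genetic algorithm on HTCondor, but gives no implementation details. The following is only a minimal single-process sketch of the general idea of genetic-algorithm feature selection, not the authors' code. The function names, the parameter defaults, and the toy fitness function are all hypothetical; in practice the fitness would presumably be something like the cross-validated performance of a classifier trained on the selected features.

```python
import random


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        mutation_rate=0.05, seed=0):
    """Return the best boolean feature mask found by a simple genetic algorithm.

    Illustrative only: single process, no HTCondor distribution.
    """
    rng = random.Random(seed)
    # Start from random masks; True means the feature is kept.
    population = [[rng.random() < 0.5 for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick():
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small probability (mutation).
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def toy_fitness(mask):
    # Hypothetical stand-in for classifier accuracy: features 0, 3 and 7
    # count as "informative", and every selected feature costs a little.
    informative = {0, 3, 7}
    hits = sum(1 for i in informative if mask[i])
    return hits - 0.1 * sum(mask)


best_mask = evolve_feature_mask(toy_fitness, n_features=10)
selected = [i for i, keep in enumerate(best_mask) if keep]
print(selected)
```

Because each candidate mask can be scored independently, the fitness evaluations are the natural unit to farm out to separate jobs in a distributed setting; how the authors actually partitioned the work on HTCondor is not stated in this record.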