Delineating the impact of machine learning elements in pre-microRNA detection

Gene regulation modulates RNA expression via transcription factors. Post-transcriptional gene regulation in turn influences the amount of protein product through, for example, microRNAs (miRNAs). Experimental establishment of miRNAs and their effects is complicated and even futile when aiming to est...

Bibliographic Details
Main Authors:	Müşerref Duygu Saçar Demirci, Jens Allmer
Format:	Article
Language:	English
Published:	PeerJ Inc. 2017-03-01
Series:	PeerJ
Subjects:	MicroRNA Machine learning Feature selection Negative dataset ML strategy Ab initio pre-miRNA detection
Online Access:	https://peerj.com/articles/3131.pdf

id	doaj-4d9e49255cc54f8fa959310149d810e0
record_format	Article
spelling	doaj-4d9e49255cc54f8fa959310149d810e0 | 2020-11-24T22:56:46Z | eng | PeerJ Inc. | PeerJ | 2167-8359 | 2017-03-01 | 5 | e3131 | 10.7717/peerj.3131 | Delineating the impact of machine learning elements in pre-microRNA detection | Müşerref Duygu Saçar Demirci (0) | Jens Allmer (1) | Department of Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey | Department of Molecular Biology and Genetics, Izmir Institute of Technology, Urla, Izmir, Turkey | Gene regulation modulates RNA expression via transcription factors. Post-transcriptional gene regulation in turn influences the amount of protein product through, for example, microRNAs (miRNAs). Experimental establishment of miRNAs and their effects is complicated and even futile when aiming to establish the entirety of miRNA target interactions. Therefore, computational approaches have been proposed. Many such tools rely on machine learning (ML) which involves example selection, feature extraction, model training, algorithm selection, and parameter optimization. Different ML algorithms have been used for model training on various example sets, more than 1,000 features describing pre-miRNAs have been proposed and different training and testing schemes have been used for model establishment. For pre-miRNA detection, negative examples cannot easily be established causing a problem for two class classification algorithms. There is also no consensus on what ML approach works best and, therefore, we set forth and established the impact of the different parts involved in ML on model performance. Furthermore, we established two new negative datasets and analyzed the impact of using them for training and testing. It was our aim to attach an order of importance to the parts involved in ML for pre-miRNA detection, but instead we found that all parts are intricately connected and their contributions cannot be easily untangled leading us to suggest that when attempting ML-based pre-miRNA detection many scenarios need to be explored. | https://peerj.com/articles/3131.pdf | MicroRNA | Machine learning | Feature selection | Negative dataset | ML strategy | Ab initio pre-miRNA detection
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Müşerref Duygu Saçar Demirci Jens Allmer
spellingShingle	Müşerref Duygu Saçar Demirci Jens Allmer Delineating the impact of machine learning elements in pre-microRNA detection PeerJ MicroRNA Machine learning Feature selection Negative dataset ML strategy Ab initio pre-miRNA detection
author_facet	Müşerref Duygu Saçar Demirci Jens Allmer
author_sort	Müşerref Duygu Saçar Demirci
title	Delineating the impact of machine learning elements in pre-microRNA detection
title_short	Delineating the impact of machine learning elements in pre-microRNA detection
title_full	Delineating the impact of machine learning elements in pre-microRNA detection
title_fullStr	Delineating the impact of machine learning elements in pre-microRNA detection
title_full_unstemmed	Delineating the impact of machine learning elements in pre-microRNA detection
title_sort	delineating the impact of machine learning elements in pre-microrna detection
publisher	PeerJ Inc.
series	PeerJ
issn	2167-8359
publishDate	2017-03-01
description	Gene regulation modulates RNA expression via transcription factors. Post-transcriptional gene regulation in turn influences the amount of protein product through, for example, microRNAs (miRNAs). Experimental establishment of miRNAs and their effects is complicated and even futile when aiming to establish the entirety of miRNA target interactions. Therefore, computational approaches have been proposed. Many such tools rely on machine learning (ML) which involves example selection, feature extraction, model training, algorithm selection, and parameter optimization. Different ML algorithms have been used for model training on various example sets, more than 1,000 features describing pre-miRNAs have been proposed and different training and testing schemes have been used for model establishment. For pre-miRNA detection, negative examples cannot easily be established causing a problem for two class classification algorithms. There is also no consensus on what ML approach works best and, therefore, we set forth and established the impact of the different parts involved in ML on model performance. Furthermore, we established two new negative datasets and analyzed the impact of using them for training and testing. It was our aim to attach an order of importance to the parts involved in ML for pre-miRNA detection, but instead we found that all parts are intricately connected and their contributions cannot be easily untangled leading us to suggest that when attempting ML-based pre-miRNA detection many scenarios need to be explored.
topic	MicroRNA Machine learning Feature selection Negative dataset ML strategy Ab initio pre-miRNA detection
url	https://peerj.com/articles/3131.pdf
work_keys_str_mv	AT muserrefduygusacardemirci delineatingtheimpactofmachinelearningelementsinpremicrornadetection AT jensallmer delineatingtheimpactofmachinelearningelementsinpremicrornadetection
_version_	1725653471618662400

Similar Items