An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms

Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria which h...

Bibliographic Details
Main Authors: Hong-Li Hua, Fa-Zhan Zhang, Abraham Alemayehu Labena, Chuan Dong, Yan-Ting Jin, Feng-Biao Guo
Format: Article
Language: English
Published: Hindawi Limited 2016-01-01
Series: BioMed Research International
Online Access: http://dx.doi.org/10.1155/2016/7639397
id doaj-96691a357b5b4d97b122db4b35077e2f
record_format Article
spelling doaj-96691a357b5b4d97b122db4b35077e2f 2020-11-24T23:12:51Z eng Hindawi Limited BioMed Research International 2314-6133 2314-6141 2016-01-01 2016 10.1155/2016/7639397 7639397 An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms
Hong-Li Hua, Fa-Zhan Zhang, Abraham Alemayehu Labena, Chuan Dong, Yan-Ting Jin, Feng-Biao Guo
Center of Bioinformatics, School of Life Science and Technology, Key Laboratory for Neuroinformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China (affiliation listed for all six authors)
http://dx.doi.org/10.1155/2016/7639397
collection DOAJ
language English
format Article
sources DOAJ
author Hong-Li Hua
Fa-Zhan Zhang
Abraham Alemayehu Labena
Chuan Dong
Yan-Ting Jin
Feng-Biao Guo
title An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms
publisher Hindawi Limited
series BioMed Research International
issn 2314-6133
2314-6141
publishDate 2016-01-01
description Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria that have characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Proper features were then used to construct models for making predictions in distantly related bacteria. The accuracy of these predictions was evaluated by their consistency with the known essential genes of the target species. A highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. A recently released independent dataset from Synechococcus elongatus was obtained to further assess the performance of our model. The AUC of the predictions is 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results comparable to, or even better than, those obtained with integrated features. The work also indicates that a machine learning-based method can assign more effective weight coefficients than an empirical formula based on biological knowledge.
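The evaluation metric named in the description (AUC under tenfold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `auc`, `tenfold_indices`, and `cross_validated_auc`, the `fit`/`predict` callbacks, and the toy data are all hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) identity rather than whatever tool the study used.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tenfold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(features, labels, fit, predict, k=10, seed=0):
    """Score each gene with a model trained on the other k-1 folds."""
    scores = [0.0] * len(labels)
    for fold in tenfold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in fold:
            scores[i] = predict(model, features[i])
    return auc(labels, scores)


# Toy usage (assumed data): one homology-derived score per gene, label 1 = essential.
toy_labels = [1] * 10 + [0] * 10
toy_features = [1.0] * 10 + [0.0] * 10
toy_auc = cross_validated_auc(
    toy_features, toy_labels,
    fit=lambda X, y: None,            # placeholder "model" with no parameters
    predict=lambda model, x: x,       # score is the feature itself
)
```

Any real classifier can be plugged in through `fit` and `predict`; the placeholder callbacks here exist only to keep the sketch self-contained.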
url http://dx.doi.org/10.1155/2016/7639397