An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms
Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria which h...
Main Authors: | Hong-Li Hua, Fa-Zhan Zhang, Abraham Alemayehu Labena, Chuan Dong, Yan-Ting Jin, Feng-Biao Guo |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2016-01-01 |
Series: | BioMed Research International |
Online Access: | http://dx.doi.org/10.1155/2016/7639397 |
id |
doaj-96691a357b5b4d97b122db4b35077e2f |
---|---|
record_format |
Article |
spelling |
doaj-96691a357b5b4d97b122db4b35077e2f; 2020-11-24T23:12:51Z; eng; Hindawi Limited; BioMed Research International; 2314-6133; 2314-6141; 2016-01-01; 2016; 10.1155/2016/7639397; 7639397; An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms; Hong-Li Hua, Fa-Zhan Zhang, Abraham Alemayehu Labena, Chuan Dong, Yan-Ting Jin, Feng-Biao Guo (all authors: Center of Bioinformatics, School of Life Science and Technology, Key Laboratory for Neuroinformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China); http://dx.doi.org/10.1155/2016/7639397 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hong-Li Hua, Fa-Zhan Zhang, Abraham Alemayehu Labena, Chuan Dong, Yan-Ting Jin, Feng-Biao Guo |
title |
An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms |
publisher |
Hindawi Limited |
series |
BioMed Research International |
issn |
2314-6133 2314-6141 |
publishDate |
2016-01-01 |
description |
Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria with characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Appropriate features were used to construct models for making predictions in distantly related bacteria. Prediction accuracy was evaluated by the consistency between the predictions and the known essential genes of the target species. A highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. An independent, recently released dataset from Synechococcus elongatus was obtained to further assess the performance of the model. The AUC of these predictions is 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results comparable to, or even better than, those of integrated features. Meanwhile, the work indicates that a machine learning-based method can assign more effective weight coefficients than an empirical formula based on biological knowledge. |
url |
http://dx.doi.org/10.1155/2016/7639397 |
_version_ |
1725600503098769408 |
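
The abstract in this record reports all of its results as AUC values. As a reading aid, the following is a minimal sketch of how an AUC can be computed from predicted scores and known labels; it is not the authors' code. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration (1 = essential gene, 0 = non-essential gene).

```python
def roc_auc(labels, scores):
    """Return the area under the ROC curve.

    Uses the pairwise definition: the fraction of (positive, negative)
    pairs in which the positive example has the higher score, with
    tied scores counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 1.0 means every essential gene is scored above every non-essential gene, while 0.5 corresponds to random ranking. In a tenfold cross-validation setting such as the one the abstract mentions, a function like this would be applied to the held-out predictions of each fold.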