Investigating the Impact of Gene Cofunctionality in Predicting Gene Mutations of <italic>E. coli</italic>
Machine learning algorithms (MLAs) have recently been applied to predict gene mutations of Escherichia coli (E. coli) under different exposure conditions, with room for improvement in performance. In a bid to improve performance, we hypothesize that incorporating the interactions between genes will...
Main Authors: | Michael Okwori, Ali Eslami |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
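Subjects: | Mutation prediction; <italic>E. coli</italic> gene interactions; feature selection; machine learning; artificial neural network; support vector classifier |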
Online Access: | https://ieeexplore.ieee.org/document/9195469/ |
id |
doaj-8a1a397044d346479cebfce983a3d36d |
---|---|
record_format |
Article |
spelling |
doaj-8a1a397044d346479cebfce983a3d36d; 2021-03-30T03:46:49Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 167397; 167410; 10.1109/ACCESS.2020.3023662; 9195469; Investigating the Impact of Gene Cofunctionality in Predicting Gene Mutations of <italic>E. coli</italic>; Michael Okwori 0 https://orcid.org/0000-0002-8827-8685; Ali Eslami 1 https://orcid.org/0000-0002-7907-1930; Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS, USA; Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS, USA; Machine learning algorithms (MLAs) have recently been applied to predict gene mutations of Escherichia coli (E. coli) under different exposure conditions, with room for improvement in performance. In a bid to improve performance, we hypothesize that incorporating the interactions between genes will help MLAs make better predictions. To investigate this, we integrated protein-coding gene cofunctional networks into a mutation dataset of E. coli exposed to different conditions. Also, we proposed a feature-selection algorithm based on gene cofunctional networks to pick the most relevant exposure conditions. Then, we used the extended dataset to train a support vector classifier, an artificial neural network, and an ensemble of both MLAs. Separate models were trained for each of the protein-coding genes. Validation results showed that our approach improved both the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve (AUPRC). A peak increase of 8.20% in AUPRC was observed. A similar analysis on selected genes, with ten or more mutation points for each gene, also showed improvement in the general performance of the MLAs. Out-of-sample testing on adaptive laboratory evolution experiments curated from the literature provided further evidence of an enhanced mutation-prediction performance, where a maximum 8.74% boost in the AUC was observed. Finally, we highlighted the genes with the most improved and most degraded predictions due to the additional information of the cofunctional genes. This work suggests that the functional relationship between genes may play a role in gene mutation and illustrates how the relationships might help to improve mutation prediction.; https://ieeexplore.ieee.org/document/9195469/; Mutation prediction, <italic>E. coli</italic> gene interactions, feature selection, machine learning, artificial neural network, support vector classifier |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Michael Okwori Ali Eslami |
title |
Investigating the Impact of Gene Cofunctionality in Predicting Gene Mutations of <italic>E. coli</italic> |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Machine learning algorithms (MLAs) have recently been applied to predict gene mutations of Escherichia coli (E. coli) under different exposure conditions, with room for improvement in performance. In a bid to improve performance, we hypothesize that incorporating the interactions between genes will help MLAs make better predictions. To investigate this, we integrated protein-coding gene cofunctional networks into a mutation dataset of E. coli exposed to different conditions. Also, we proposed a feature-selection algorithm based on gene cofunctional networks to pick the most relevant exposure conditions. Then, we used the extended dataset to train a support vector classifier, an artificial neural network, and an ensemble of both MLAs. Separate models were trained for each of the protein-coding genes. Validation results showed that our approach improved both the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision-recall curve (AUPRC). A peak increase of 8.20% in AUPRC was observed. A similar analysis on selected genes, with ten or more mutation points for each gene, also showed improvement in the general performance of the MLAs. Out-of-sample testing on adaptive laboratory evolution experiments curated from the literature provided further evidence of an enhanced mutation-prediction performance, where a maximum 8.74% boost in the AUC was observed. Finally, we highlighted the genes with the most improved and most degraded predictions due to the additional information of the cofunctional genes. This work suggests that the functional relationship between genes may play a role in gene mutation and illustrates how the relationships might help to improve mutation prediction. |
topic |
Mutation prediction; <italic>E. coli</italic> gene interactions; feature selection; machine learning; artificial neural network; support vector classifier |
url |
https://ieeexplore.ieee.org/document/9195469/ |
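The description above reports results as AUC and AUPRC for a support vector classifier, an artificial neural network, and an ensemble of both. As a rough illustration of how those two metrics and a simple ensemble can be computed, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the labels and scores are toy values, the function names are hypothetical, and averaging the two models' scores is only one possible ensembling rule; the record does not say how the authors combined their models.

```python
# Illustrative sketch only: toy labels and scores, not data from the paper.

def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


def mean_ensemble(scores_a, scores_b):
    """Average two models' scores (an assumed, simple ensembling rule)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


# Hypothetical outputs of two per-gene classifiers for six exposure conditions.
labels = [1, 0, 1, 0, 0, 1]  # 1 = gene mutated under that condition
svc_scores = [0.9, 0.4, 0.3, 0.2, 0.6, 0.8]
ann_scores = [0.7, 0.1, 0.6, 0.5, 0.2, 0.9]
ens_scores = mean_ensemble(svc_scores, ann_scores)

for name, s in [("SVC", svc_scores), ("ANN", ann_scores), ("ensemble", ens_scores)]:
    print(f"{name}: AUC={roc_auc(labels, s):.3f}  AUPRC={average_precision(labels, s):.3f}")
```

AUPRC is reported alongside AUC because mutation events are typically rare relative to non-events, and precision-recall summaries are more sensitive to performance on the rare positive class.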