Identification of Disease-Related 2-Oxoglutarate/Fe (II)-Dependent Oxygenase Based on Reduced Amino Acid Cluster Strategy
The 2-oxoglutarate/Fe (II)-dependent (2OG) oxygenase superfamily is mainly responsible for protein modification, nucleic acid repair and/or modification, and fatty acid metabolism and plays important roles in cancer, cardiovascular disease, and other diseases. They are likely to become new targets f...
Main Authors: | Jian Zhou, Suling Bo, Hao Wang, Lei Zheng, Pengfei Liang, Yongchun Zuo |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2021-07-01 |
Series: | Frontiers in Cell and Developmental Biology |
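Subjects: | 2-oxoglutarate/Fe (II)-dependent oxygenase; reduced amino acid cluster; machine learning; ANOVA; incremental feature selection; 10-fold cross-validation test |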
Online Access: | https://www.frontiersin.org/articles/10.3389/fcell.2021.707938/full |
id |
doaj-20a134f234fc4a77bf2d87d3a961d932 |
---|---|
record_format |
Article |
spelling |
doaj-20a134f234fc4a77bf2d87d3a961d932
2021-07-16T17:00:23Z
eng
Frontiers Media S.A.
Frontiers in Cell and Developmental Biology
2296-634X
2021-07-01
9
10.3389/fcell.2021.707938
707938
Identification of Disease-Related 2-Oxoglutarate/Fe (II)-Dependent Oxygenase Based on Reduced Amino Acid Cluster Strategy
Jian Zhou (0), State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
Suling Bo (1), College of Computer and Information, Inner Mongolia Medical University, Hohhot, China
Hao Wang (2), State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
Lei Zheng (3), State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
Pengfei Liang (4), State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
Yongchun Zuo (5), State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
The 2-oxoglutarate/Fe (II)-dependent (2OG) oxygenase superfamily is mainly responsible for protein modification, nucleic acid repair and/or modification, and fatty acid metabolism, and plays important roles in cancer, cardiovascular disease, and other diseases. Its members are likely to become new targets for the treatment of cancer and other diseases, so the accurate identification of 2OG oxygenases is of great significance. Many computational methods have been proposed to predict functional proteins to compensate for time-consuming and expensive experimental identification. However, machine learning has not been applied to the study of 2OG oxygenases. In this study, we developed OGFE_RAAC, a prediction model to identify whether a protein is a 2OG oxygenase. To improve the performance of OGFE_RAAC, 673 amino acid reduction alphabets were used to recode the protein sequences and determine the optimal feature representation scheme. The 10-fold cross-validation test showed that the model identifies 2OG oxygenases with an accuracy of 91.04%. In addition, the results on an independent dataset showed that the model generalizes well and is robust. It is expected to become an effective tool for the identification of 2OG oxygenases. Further analysis suggested that the function of 2OG oxygenases may be related to their polarity and hydrophobicity, which will help follow-up studies on the catalytic mechanism of 2OG oxygenases and the way they interact with their substrates. Based on the model, a user-friendly web server was established and is accessible at http://bioinfor.imu.edu.cn/ogferaac.
https://www.frontiersin.org/articles/10.3389/fcell.2021.707938/full
2-oxoglutarate/Fe (II)-dependent oxygenase; reduced amino acid cluster; machine learning; ANOVA; incremental feature selection; 10-fold cross-validation test |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Jian Zhou, Suling Bo, Hao Wang, Lei Zheng, Pengfei Liang, Yongchun Zuo |
spellingShingle |
Jian Zhou, Suling Bo, Hao Wang, Lei Zheng, Pengfei Liang, Yongchun Zuo; Identification of Disease-Related 2-Oxoglutarate/Fe (II)-Dependent Oxygenase Based on Reduced Amino Acid Cluster Strategy; Frontiers in Cell and Developmental Biology; 2-oxoglutarate/Fe (II)-dependent oxygenase; reduced amino acid cluster; machine learning; ANOVA; incremental feature selection; 10-fold cross-validation test |
author_sort |
Jian Zhou |
title |
Identification of Disease-Related 2-Oxoglutarate/Fe (II)-Dependent Oxygenase Based on Reduced Amino Acid Cluster Strategy |
title_sort |
identification of disease-related 2-oxoglutarate/fe (ii)-dependent oxygenase based on reduced amino acid cluster strategy |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Cell and Developmental Biology |
issn |
2296-634X |
publishDate |
2021-07-01 |
description |
The 2-oxoglutarate/Fe (II)-dependent (2OG) oxygenase superfamily is mainly responsible for protein modification, nucleic acid repair and/or modification, and fatty acid metabolism, and plays important roles in cancer, cardiovascular disease, and other diseases. Its members are likely to become new targets for the treatment of cancer and other diseases, so the accurate identification of 2OG oxygenases is of great significance. Many computational methods have been proposed to predict functional proteins to compensate for time-consuming and expensive experimental identification. However, machine learning has not been applied to the study of 2OG oxygenases. In this study, we developed OGFE_RAAC, a prediction model to identify whether a protein is a 2OG oxygenase. To improve the performance of OGFE_RAAC, 673 amino acid reduction alphabets were used to recode the protein sequences and determine the optimal feature representation scheme. The 10-fold cross-validation test showed that the model identifies 2OG oxygenases with an accuracy of 91.04%. In addition, the results on an independent dataset showed that the model generalizes well and is robust. It is expected to become an effective tool for the identification of 2OG oxygenases. Further analysis suggested that the function of 2OG oxygenases may be related to their polarity and hydrophobicity, which will help follow-up studies on the catalytic mechanism of 2OG oxygenases and the way they interact with their substrates. Based on the model, a user-friendly web server was established and is accessible at http://bioinfor.imu.edu.cn/ogferaac. |
topic |
2-oxoglutarate/Fe (II)-dependent oxygenase; reduced amino acid cluster; machine learning; ANOVA; incremental feature selection; 10-fold cross-validation test |
url |
https://www.frontiersin.org/articles/10.3389/fcell.2021.707938/full |
work_keys_str_mv |
AT jianzhou identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy AT sulingbo identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy AT haowang identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy AT leizheng identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy AT pengfeiliang identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy AT yongchunzuo identificationofdiseaserelated2oxoglutaratefeiidependentoxygenasebasedonreducedaminoacidclusterstrategy |
_version_ |
1721297512479850496 |
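
The description above mentions recoding protein sequences with reduced amino acid alphabets to build feature representations. The following is a minimal Python sketch of that general idea, not the OGFE_RAAC implementation: the five-cluster alphabet, the function names, and the choice of 2-mer composition features are all assumptions made for illustration, and the example alphabet is not claimed to be one of the 673 alphabets evaluated in the article.

```python
from collections import Counter
from itertools import product

# Hypothetical 5-cluster reduction, grouped loosely by polarity and
# hydrophobicity. Illustrative only; not taken from the article.
EXAMPLE_CLUSTERS = ["AVLIMC", "FWY", "STNQ", "KRH", "DEGP"]


def build_mapping(clusters):
    """Map every residue to the first letter of its cluster."""
    return {aa: cluster[0] for cluster in clusters for aa in cluster}


def reduce_sequence(seq, mapping):
    """Recode a protein sequence; residues outside the mapping are dropped."""
    return "".join(mapping[aa] for aa in seq.upper() if aa in mapping)


def kmer_composition(reduced, alphabet, k=2):
    """Return k-mer frequencies over the reduced alphabet in a fixed order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    windows = len(reduced) - k + 1
    counts = Counter(reduced[i:i + k] for i in range(windows))
    total = max(windows, 1)  # avoid division by zero for very short input
    return [counts[kmer] / total for kmer in kmers]


mapping = build_mapping(EXAMPLE_CLUSTERS)
alphabet = [cluster[0] for cluster in EXAMPLE_CLUSTERS]

# A made-up peptide, used only to exercise the functions.
reduced = reduce_sequence("MKVLAAGIFDE", mapping)
features = kmer_composition(reduced, alphabet, k=2)
```

With a 5-letter reduced alphabet, the 2-mer feature vector has 25 entries instead of the 400 needed for the full 20-letter alphabet, which is the kind of dimensionality reduction such a recoding provides before any feature selection or classifier training.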