Target Prediction Model for Natural Products Using Transfer Learning

A large proportion of lead compounds are derived from natural products. However, most natural products have not been fully tested for their targets. To help resolve this problem, a model using transfer learning was built to predict targets for natural products. The model was pre-trained on a process...

Bibliographic Details
Main Authors: Bo Qiang, Junyong Lai, Hongwei Jin, Liangren Zhang, Zhenming Liu
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: International Journal of Molecular Sciences
Subjects: target prediction, deep learning, transfer learning, natural product
Online Access: https://www.mdpi.com/1422-0067/22/9/4632
id doaj-b92cde4363374b56a16eef9746af6fcf
record_format Article
spelling doaj-b92cde4363374b56a16eef9746af6fcf
2021-04-28T23:02:36Z
eng
MDPI AG
International Journal of Molecular Sciences
ISSN 1661-6596, 1422-0067
2021-04-01
Volume 22, article 4632
DOI 10.3390/ijms22094632
Target Prediction Model for Natural Products Using Transfer Learning
Bo Qiang, Junyong Lai, Hongwei Jin, Liangren Zhang, Zhenming Liu (all authors: State Key Laboratory of Natural and Biomimetic Drugs, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China)
collection DOAJ
language English
format Article
sources DOAJ
author Bo Qiang
Junyong Lai
Hongwei Jin
Liangren Zhang
Zhenming Liu
title Target Prediction Model for Natural Products Using Transfer Learning
publisher MDPI AG
series International Journal of Molecular Sciences
issn 1661-6596
1422-0067
publishDate 2021-04-01
description A large proportion of lead compounds are derived from natural products. However, most natural products have not been fully tested for their targets. To help resolve this problem, a model using transfer learning was built to predict targets for natural products. The model was pre-trained on a processed ChEMBL dataset and then fine-tuned on a natural product dataset. Benefitting from transfer learning and the data balancing technique, the model achieved a highly promising area under the receiver operating characteristic curve (AUROC) score of 0.910, with limited task-related training samples. Since the embedding distribution difference is reduced, embedding space analysis demonstrates that the model’s outputs of natural products are reliable. Case studies have proved our model’s performance in drug datasets. The fine-tuned model can successfully output all the targets of 62 drugs. Compared with a previous study, our model achieved better results in terms of both AUROC validation and its success rate for obtaining active targets among the top ones. The target prediction model using transfer learning can be applied in the field of natural product-based drug discovery and has the potential to find more lead compounds or to assist researchers in drug repurposing.
topic target prediction
deep learning
transfer learning
natural product
url https://www.mdpi.com/1422-0067/22/9/4632