Machine learning screening of bile acid-binding peptides in a peptide database derived from food proteins

Abstract Bioactive peptides (BPs) are protein fragments that exhibit a wide variety of physicochemical properties, such as basic, acidic, hydrophobic, and hydrophilic properties; thus, they have the potential to interact with a variety of biomolecules, whereas neither carbohydrates nor fatty acids h...


Bibliographic Details
Main Authors: Kento Imai, Kazunori Shimizu, Hiroyuki Honda
Format: Article
Language: English
Published: Nature Publishing Group 2021-08-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-95461-1
id doaj-40e03eb184ff4eedb8df4a714d421521
record_format Article
spelling doaj-40e03eb184ff4eedb8df4a714d421521; 2021-08-15T11:24:07Z; eng; Nature Publishing Group; Scientific Reports; 2045-2322; 2021-08-01; vol. 11, no. 1, pp. 1-11; 10.1038/s41598-021-95461-1; Machine learning screening of bile acid-binding peptides in a peptide database derived from food proteins; Kento Imai, Kazunori Shimizu, Hiroyuki Honda (all: Department of Biomolecular Engineering, Graduate School of Engineering, Nagoya University); https://doi.org/10.1038/s41598-021-95461-1
collection DOAJ
language English
format Article
sources DOAJ
author Kento Imai
Kazunori Shimizu
Hiroyuki Honda
title Machine learning screening of bile acid-binding peptides in a peptide database derived from food proteins
publisher Nature Publishing Group
series Scientific Reports
issn 2045-2322
publishDate 2021-08-01
description Bioactive peptides (BPs) are protein fragments that exhibit a wide variety of physicochemical properties, such as basic, acidic, hydrophobic, and hydrophilic properties; thus, they have the potential to interact with a variety of biomolecules, whereas neither carbohydrates nor fatty acids have such diverse properties. BPs are therefore considered a new generation of biologically active regulators. Recently, some BPs that have shown positive benefits in humans have been screened from edible proteins. In the present study, a new BP screening method was developed using BIOPEP-UWM and machine learning. Training data were first obtained using high-throughput techniques, and positive and negative datasets were generated. The predictive model was built from explanatory variables calculated for each peptide. To capture both site-specific and global characteristics, amino acid features (site-specific) and peptide global features (global) were generated. The constructed models were applied to the peptide database generated using BIOPEP-UWM, and bioactivity was predicted to explore candidate bile acid-binding peptides. Using this strategy, seven novel bile acid-binding peptides (VFWM, QRIFW, RVWVQ, LIRYTK, NGDEPL, PTFTRKL, and KISQRYQ) were identified. The screening method can be readily applied in industrial settings using whole edible proteins. The proposed approach would be useful for identifying bile acid-binding peptides, as well as other BPs, as long as a large amount of training data can be obtained.
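The distinction the abstract draws between site-specific amino acid features and peptide global features can be illustrated with a minimal Python sketch. The abstract does not say which descriptors the authors used, so everything below is an assumption chosen for illustration: the Kyte-Doolittle hydropathy scale as the per-residue descriptor, the crude net-charge count (K and R minus D and E, ignoring histidine, termini, and pH), and the function names `site_features` and `global_features`, which are hypothetical. Only the seven peptide sequences come from the record.

```python
# Illustrative sketch only: the descriptors below are assumptions,
# not the explanatory variables used in the cited study.

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# The seven peptides named in the abstract.
PEPTIDES = ["VFWM", "QRIFW", "RVWVQ", "LIRYTK", "NGDEPL", "PTFTRKL", "KISQRYQ"]


def site_features(peptide):
    """Site-specific view: one hydropathy value per residue position."""
    return [KD[aa] for aa in peptide]


def global_features(peptide):
    """Global view: descriptors that summarize the whole peptide."""
    hydropathy = site_features(peptide)
    # Crude net charge: basic residues (K, R) minus acidic residues (D, E).
    charge = sum(peptide.count(aa) for aa in "KR") - sum(
        peptide.count(aa) for aa in "DE"
    )
    return {
        "length": len(peptide),
        "mean_hydropathy": sum(hydropathy) / len(hydropathy),
        "net_charge": charge,
    }


if __name__ == "__main__":
    for pep in PEPTIDES:
        print(pep, site_features(pep), global_features(pep))
```

In a screening setting, vectors like these would be computed for every peptide in the database and passed to a trained classifier; the choice of classifier and of the real descriptor set is not stated in this record.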
url https://doi.org/10.1038/s41598-021-95461-1