Predicting aptamer sequences that interact with target proteins using an aptamer-protein interaction classifier and a Monte Carlo tree search approach.

Oligonucleotide-based aptamers, which have a three-dimensional structure with a single-stranded fragment, feature various characteristics with respect to size, toxicity, and permeability. Accordingly, aptamers are advantageous in terms of diagnosis and treatment and are materials that can be produce...

Bibliographic Details
Main Authors: Gwangho Lee, Gun Hyuk Jang, Ho Young Kang, Giltae Song
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0253760
id doaj-51ef32496859423ea43c1f84b68ec2eb
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Gwangho Lee
Gun Hyuk Jang
Ho Young Kang
Giltae Song
author_sort Gwangho Lee
title Predicting aptamer sequences that interact with target proteins using an aptamer-protein interaction classifier and a Monte Carlo tree search approach.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2021-01-01
description Oligonucleotide-based aptamers, which have a three-dimensional structure with a single-stranded fragment, feature various characteristics with respect to size, toxicity, and permeability. Accordingly, aptamers are advantageous in terms of diagnosis and treatment and are materials that can be produced through relatively simple experiments. Systematic evolution of ligands by exponential enrichment (SELEX) is one of the most widely used experimental methods for generating aptamers; however, it is expensive and time-consuming. To reduce these costs, recent studies have used in silico approaches, such as aptamer-protein interaction (API) classifiers that use sequence patterns to determine the binding affinity between RNA aptamers and proteins. Some of these methods generate candidate RNA aptamer sequences that bind to a target protein, but they are limited to producing candidates of a specific size. In this study, we present a machine learning approach for selecting candidate sequences of various sizes that have a high binding affinity for a specific sequence of a target protein. We applied the Monte Carlo tree search (MCTS) algorithm to generate the candidate sequences, using a score function based on an API classifier. The tree structure that we designed with MCTS enables nucleotide sequence sampling, and the sampled sequences are potential aptamer candidates. We assessed candidate quality using docking simulation scores. On our validation datasets, the candidates generated by our model achieved ZDOCK docking scores similar to or better than those of the known aptamers. We expect that our method, which is size-independent and easy to use, can provide insights into searching for an appropriate aptamer sequence for a target protein during the simulation step of SELEX.
url https://doi.org/10.1371/journal.pone.0253760
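
The abstract describes sampling nucleotide sequences with Monte Carlo tree search guided by a score from an API classifier. The following is a minimal, generic sketch of that idea, not the authors' implementation: the tree layout (appending one base per level), the UCB1 selection rule, and every name here (`toy_score`, `Node`, `mcts_generate`) are assumptions made for illustration. In particular, `toy_score` is a hypothetical stand-in that simply rewards G/C content; a real pipeline would call a trained classifier instead.

```python
import math
import random

ALPHABET = "ACGU"  # RNA nucleotides


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for an API classifier score in [0, 1].

    Returns the fraction of G/C bases. This is NOT the authors' classifier.
    """
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


class Node:
    """One tree node; its sequence is the prefix built so far."""

    def __init__(self, seq, parent=None):
        self.seq = seq
        self.parent = parent
        self.children = {}  # base -> Node
        self.visits = 0
        self.total = 0.0

    def ucb_child(self, c=1.4):
        # UCB1: mean reward plus an exploration bonus.
        return max(
            self.children.values(),
            key=lambda n: n.total / n.visits
            + c * math.sqrt(math.log(self.visits) / n.visits),
        )


def mcts_generate(length, iterations=2000, score=toy_score, seed=0):
    """Search for a high-scoring sequence of the requested length."""
    rng = random.Random(seed)
    root = Node("")
    best_seq, best_val = "", -1.0
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while len(node.seq) < length and len(node.children) == len(ALPHABET):
            node = node.ucb_child()
        # 2. Expansion: add one untried base unless the node is terminal.
        if len(node.seq) < length:
            base = rng.choice([b for b in ALPHABET if b not in node.children])
            node.children[base] = Node(node.seq + base, parent=node)
            node = node.children[base]
        # 3. Simulation: complete the prefix with random bases and score it.
        missing = length - len(node.seq)
        rollout = node.seq + "".join(rng.choice(ALPHABET) for _ in range(missing))
        value = score(rollout)
        if value > best_val:
            best_seq, best_val = rollout, value
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    return best_seq, best_val
```

Calling `mcts_generate(length=12)` returns the best sequence seen and its score. Because the target length is only a parameter, candidates of different sizes can be requested by calling the function with different `length` values, which loosely mirrors the size independence the abstract emphasizes.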