Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models
Generative models are becoming a tool of choice for exploring the molecular space. These models learn from a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models i...
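Main Authors: | Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alán Aspuru-Guzik, Alex Zhavoronkov |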
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-12-01 |
Series: | Frontiers in Pharmacology |
Subjects: | generative models, drug discovery, deep learning, benchmark, distribution learning |
Online Access: | https://www.frontiersin.org/articles/10.3389/fphar.2020.565644/full |
id |
doaj-20c7f554a2f7442e8fa7b4cbcaa41469 |
---|---|
record_format |
Article |
spelling |
doaj-20c7f554a2f7442e8fa7b4cbcaa41469; 2020-12-18T06:30:50Z; eng; Frontiers Media S.A.; Frontiers in Pharmacology; ISSN 1663-9812; 2020-12-01; vol. 11; DOI 10.3389/fphar.2020.565644; article 565644; Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models; Daniil Polykovskiy (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong); Alexander Zhebrak (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong); Benjamin Sanchez-Lengeling (Chemistry and Chemical Biology Department, Harvard University, Cambridge, MA, United States); Sergey Golovanov (Neuromation OU, Tallinn, Estonia); Oktai Tatanov (Neuromation OU, Tallinn, Estonia); Stanislav Belyaev (Neuromation OU, Tallinn, Estonia); Rauf Kurbanov (Neuromation OU, Tallinn, Estonia); Aleksey Artamonov (Neuromation OU, Tallinn, Estonia); Vladimir Aladinskiy (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong); Mark Veselov (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong); Artur Kadurin (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong); Simon Johansson (Molecular AI, DiscoverySciences, R&D, AstraZeneca, Gothenburg, Sweden); Hongming Chen (Molecular AI, DiscoverySciences, R&D, AstraZeneca, Gothenburg, Sweden); Sergey Nikolenko (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong; Neuromation OU, Tallinn, Estonia; Computer Science Department, National Research University Higher School of Economics, St. Petersburg, Russia); Alán Aspuru-Guzik (Chemical Physics Theory Group, Department of Chemistry, University of Toronto, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; CIFAR AI Chair, Vector Institute for Artificial Intelligence, Toronto, ON, Canada; Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON, Canada); Alex Zhavoronkov (Insilico Medicine Hong Kong Ltd., Pak Shek Kok, Hong Kong) |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alán Aspuru-Guzik, Alex Zhavoronkov |
title |
Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Pharmacology |
issn |
1663-9812 |
publishDate |
2020-12-01 |
description |
Generative models are becoming a tool of choice for exploring the molecular space. These models learn from a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or for training semi-supervised predictive models in downstream tasks. While there are plenty of generative models, it is unclear how to compare and rank them. In this work, we introduce a benchmarking platform called Molecular Sets (MOSES) to standardize training and comparison of molecular generative models. MOSES provides training and testing datasets, and a set of metrics to evaluate the quality and diversity of generated structures. We have implemented and compared several molecular generation models and suggest using our results as reference points for further advancements in generative chemistry research. The platform and source code are available at https://github.com/molecularsets/moses. |
topic |
generative models, drug discovery, deep learning, benchmark, distribution learning |
url |
https://www.frontiersin.org/articles/10.3389/fphar.2020.565644/full |