Prediction of Chromatography Conditions for Purification in Organic Synthesis Using Deep Learning
In this research, a process for developing normal-phase liquid chromatography solvent systems has been proposed. In contrast to the development of conditions via thin-layer chromatography (TLC), this process is based on the architecture of two hierarchically connected neural network-based components...
Main Authors: | Mantas Vaškevičius, Jurgita Kapočiūtė-Dzikienė, Liudas Šlepikas |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-04-01 |
Series: | Molecules |
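Subjects: | deep learning; chromatography; neural networks; machine learning; solvent prediction; organic synthesis |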
Online Access: | https://www.mdpi.com/1420-3049/26/9/2474 |
id |
doaj-ee0408718f4e45a29865908a359620e0 |
---|---|
record_format |
Article |
spelling |
doaj-ee0408718f4e45a29865908a359620e0; 2021-04-23T23:05:28Z; eng; MDPI AG; Molecules; 1420-3049; 2021-04-01; 26; 2474; 2474; 10.3390/molecules26092474; Prediction of Chromatography Conditions for Purification in Organic Synthesis Using Deep Learning; Mantas Vaškevičius (0); Jurgita Kapočiūtė-Dzikienė (1); Liudas Šlepikas (2); (0) Department of Applied Informatics, Vytautas Magnus University, LT-44404 Kaunas, Lithuania; (1) Department of Applied Informatics, Vytautas Magnus University, LT-44404 Kaunas, Lithuania; (2) JSC Synhet, Biržų Str. 6, LT-44139 Kaunas, Lithuania; In this research, a process for developing normal-phase liquid chromatography solvent systems has been proposed. In contrast to the development of conditions via thin-layer chromatography (TLC), this process is based on the architecture of two hierarchically connected neural network-based components. Using a large database of reaction procedures allows those two components to perform an essential role in the machine-learning-based prediction of chromatographic purification conditions, i.e., solvents and the ratio between solvents. In our paper, we build two datasets and test various molecular vectorization approaches, such as extended-connectivity fingerprints, learned embedding, and auto-encoders along with different types of deep neural networks to demonstrate a novel method for modeling chromatographic solvent systems employing two neural networks in sequence. Afterward, we present our findings and provide insights on the most effective methods for solving prediction tasks. Our approach results in a system of two neural networks with long short-term memory (LSTM)-based auto-encoders, where the first predicts solvent labels (by reaching the classification accuracy of 0.950 ± 0.001) and in the case of two solvents, the second one predicts the ratio between two solvents (R² metric equal to 0.982 ± 0.001). Our approach can be used as a guidance instrument in laboratories to accelerate scouting for suitable chromatography conditions.; https://www.mdpi.com/1420-3049/26/9/2474; deep learning; chromatography; neural networks; machine learning; solvent prediction; organic synthesis |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Mantas Vaškevičius, Jurgita Kapočiūtė-Dzikienė, Liudas Šlepikas |
title |
Prediction of Chromatography Conditions for Purification in Organic Synthesis Using Deep Learning |
publisher |
MDPI AG |
series |
Molecules |
issn |
1420-3049 |
publishDate |
2021-04-01 |
description |
In this research, a process for developing normal-phase liquid chromatography solvent systems has been proposed. In contrast to the development of conditions via thin-layer chromatography (TLC), this process is based on the architecture of two hierarchically connected neural network-based components. Using a large database of reaction procedures allows those two components to perform an essential role in the machine-learning-based prediction of chromatographic purification conditions, i.e., solvents and the ratio between solvents. In our paper, we build two datasets and test various molecular vectorization approaches, such as extended-connectivity fingerprints, learned embedding, and auto-encoders along with different types of deep neural networks to demonstrate a novel method for modeling chromatographic solvent systems employing two neural networks in sequence. Afterward, we present our findings and provide insights on the most effective methods for solving prediction tasks. Our approach results in a system of two neural networks with long short-term memory (LSTM)-based auto-encoders, where the first predicts solvent labels (by reaching the classification accuracy of 0.950 ± 0.001) and in the case of two solvents, the second one predicts the ratio between two solvents (R² metric equal to 0.982 ± 0.001). Our approach can be used as a guidance instrument in laboratories to accelerate scouting for suitable chromatography conditions. |
topic |
deep learning; chromatography; neural networks; machine learning; solvent prediction; organic synthesis |
url |
https://www.mdpi.com/1420-3049/26/9/2474 |
_version_ |
1721512112632627200 |
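
The abstract in this record describes two models applied in sequence: a classifier that predicts solvent labels and, only when two solvents are predicted, a regressor that predicts the ratio between them. A minimal Python sketch of that control flow follows, together with the two reported metric types (classification accuracy and R²). It is an illustration under stated assumptions, not the authors' implementation: `toy_classifier`, `toy_regressor`, the solvent names, and the feature vectors are hypothetical stand-ins for the trained LSTM auto-encoder-based networks, which the record does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Vector = Sequence[float]


@dataclass
class SolventSystem:
    solvents: Tuple[str, ...]
    ratio: Optional[float]  # fraction of the first solvent; None for one solvent


def predict_conditions(
    features: Vector,
    classify: Callable[[Vector], Tuple[str, ...]],
    regress: Callable[[Vector, Tuple[str, ...]], float],
) -> SolventSystem:
    """Stage 1 predicts solvent labels; stage 2 runs only for two solvents."""
    solvents = classify(features)
    ratio = regress(features, solvents) if len(solvents) == 2 else None
    return SolventSystem(solvents, ratio)


def toy_classifier(features: Vector) -> Tuple[str, ...]:
    # Hypothetical rule standing in for the trained label classifier.
    mean = sum(features) / len(features)
    return ("hexane", "ethyl acetate") if mean > 0.5 else ("dichloromethane",)


def toy_regressor(features: Vector, solvents: Tuple[str, ...]) -> float:
    # Hypothetical rule standing in for the trained ratio regressor.
    return min(max(features[0], 0.0), 1.0)


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Share of predictions that match the true labels exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    print(predict_conditions([0.9, 0.8], toy_classifier, toy_regressor))
    print(predict_conditions([0.1, 0.2], toy_classifier, toy_regressor))
```

Keeping the ratio model behind the `len(solvents) == 2` check mirrors the hierarchy stated in the abstract: the second component has no meaningful output for a single-solvent system, so it is skipped rather than asked for a placeholder value.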