Predicting Polymers’ Glass Transition Temperature by a Chemical Language Processing Model

We propose a chemical language processing model to predict polymers’ glass transition temperature ($T_g$)...

Bibliographic Details
Main Authors: Guang Chen, Lei Tao, Ying Li
Format: Article
Language: English
Published: MDPI AG 2021-06-01
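Series: Polymers
Subjects: polymer informatics; machine learning; glass transition temperature; high-throughput screening; recurrent neural network
Online Access: https://www.mdpi.com/2073-4360/13/11/1898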
id doaj-6382405dfbc94761907739242df058c9
record_format Article
spelling doaj-6382405dfbc94761907739242df058c9 | 2021-06-30T23:33:51Z | eng | MDPI AG | Polymers | 2073-4360 | 2021-06-01 | 13 | 1898 | 1898 | 10.3390/polym13111898 | Predicting Polymers’ Glass Transition Temperature by a Chemical Language Processing Model | Guang Chen, Lei Tao, Ying Li (all: Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA) | https://www.mdpi.com/2073-4360/13/11/1898 | polymer informatics; machine learning; glass transition temperature; high-throughput screening; recurrent neural network
collection DOAJ
language English
format Article
sources DOAJ
author Guang Chen
Lei Tao
Ying Li
title Predicting Polymers’ Glass Transition Temperature by a Chemical Language Processing Model
publisher MDPI AG
series Polymers
issn 2073-4360
publishDate 2021-06-01
description We propose a chemical language processing model that predicts polymers’ glass transition temperature ($T_g$) through a polymer language (SMILES, Simplified Molecular Input Line Entry System) embedding and a recurrent neural network. The model receives only the SMILES strings of a polymer’s repeat units as inputs and treats them as sequential data at the character level. With this method, no additional molecular descriptors or fingerprints of the polymers need to be calculated, which makes it computationally efficient. More importantly, it avoids the difficulty of generating molecular descriptors for repeat units that contain the polymerization point ‘*’. Results show that the trained model achieves reasonable prediction performance for the $T_g$ of unseen polymers. In addition, the model is applied to high-throughput screening of an unlabeled polymer database to identify high-temperature polymers that are desired for applications in extreme environments. Our work demonstrates that the SMILES strings of polymer repeat units can serve as an effective feature representation for developing a chemical language processing model that predicts polymer $T_g$. The framework of this model is general and can be used to construct structure–property relationships for other polymer properties.
topic polymer informatics
machine learning
glass transition temperature
high-throughput screening
recurrent neural network
url https://www.mdpi.com/2073-4360/13/11/1898
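
Illustrative note on the abstract: the description says the model reads the SMILES string of a repeat unit as character-level sequential data, including the polymerization point ‘*’. The sketch below shows one plausible way such an input could be prepared before an embedding layer and recurrent network. It is not the authors' code: the helper names (`build_vocab`, `encode`), the `<pad>` token, and the three example repeat units are assumptions made for illustration only.

```python
from typing import Dict, List

PAD = "<pad>"  # hypothetical padding token; index 0 is reserved for it


def build_vocab(smiles_list: List[str]) -> Dict[str, int]:
    """Map every distinct character in the corpus to an integer index."""
    chars = sorted({ch for smi in smiles_list for ch in smi})
    vocab = {PAD: 0}
    for i, ch in enumerate(chars, start=1):
        vocab[ch] = i
    return vocab


def encode(smiles: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Turn one SMILES string into a fixed-length list of character indices."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    ids = [vocab[ch] for ch in smiles]
    # Right-pad so all sequences in a batch share the same length.
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Assumed example repeat units: polyethylene, polypropylene, polystyrene.
repeat_units = ["*CC*", "*CC(C)*", "*CC(c1ccccc1)*"]

vocab = build_vocab(repeat_units)
max_len = max(len(s) for s in repeat_units)
encoded = [encode(s, vocab, max_len) for s in repeat_units]

for smi, ids in zip(repeat_units, encoded):
    print(smi, ids)
```

Because the split is purely by character, ‘*’ is just one more vocabulary entry and needs no special chemical handling, which matches the advantage the abstract claims over descriptor-based inputs. One consequence of this choice is that two-character atom symbols such as Cl would be split into two tokens. The resulting integer sequences are the kind of input an embedding layer followed by a recurrent network would consume.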