Summary: | Chemical substances are integral to all aspects of human life, and understanding their properties is essential for developing chemical systems. The properties of chemical species can be obtained accurately from experiments or ab initio calculations; however, both are time-consuming and costly. In this work, machine learning (ML) models for estimating entropy, S, and constant-pressure heat capacity, Cp, at 298.15 K are developed for alkanes, alkenes, and alkynes. The training data for entropy and heat capacity are collected from the literature. Molecular descriptors generated using alvaDesc software are used as input features for the ML models. Support vector regression (SVR), ν-support vector regression (ν-SVR), and random forest regression (RFR) algorithms are trained with K-fold cross-validation on two levels: the first level assesses the models' performance, and the second level generates the final models. Among the three ML models, SVR shows the best performance on the test dataset. The SVR model is then compared against the traditional Benson's group additivity method to illustrate the advantages of the ML model. Finally, a sensitivity analysis is performed to identify the most critical descriptors in the property estimations.
|
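The two-level K-fold scheme described in the summary can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data stands in for the alvaDesc descriptors and the literature property values, the hyperparameter grids and fold counts are arbitrary, and reading "two levels" as an outer assessment loop plus a final refit on all data is an assumption.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR, NuSVR

# Synthetic stand-in for the descriptor matrix X and the target property y
# (assumption: the real inputs would be alvaDesc descriptors and S or Cp values).
X, y = make_regression(n_samples=120, n_features=10, noise=5.0, random_state=0)

# Candidate models with small, illustrative hyperparameter grids (assumed values).
candidates = {
    "SVR": (make_pipeline(StandardScaler(), SVR()),
            {"svr__C": [1.0, 10.0, 100.0]}),
    "nu-SVR": (make_pipeline(StandardScaler(), NuSVR()),
               {"nusvr__nu": [0.25, 0.5, 0.75]}),
    "RFR": (RandomForestRegressor(n_estimators=50, random_state=0),
            {"max_depth": [None, 5]}),
}

outer = KFold(n_splits=5, shuffle=True, random_state=0)  # performance assessment
inner = KFold(n_splits=3, shuffle=True, random_state=1)  # hyperparameter selection

scores = {}
final_models = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=inner,
                          scoring="neg_mean_absolute_error")
    # Level 1: the outer folds estimate the error of the tuned model on held-out data.
    fold_scores = cross_val_score(search, X, y, cv=outer,
                                  scoring="neg_mean_absolute_error")
    scores[name] = -fold_scores.mean()
    # Level 2: rerun the search on all data and keep the refitted best estimator.
    final_models[name] = search.fit(X, y).best_estimator_

for name, mae in scores.items():
    print(f"{name}: mean outer-fold MAE = {mae:.2f}")
```

Keeping the hyperparameter search inside the outer loop means the held-out folds never influence tuning, so the outer-fold error is a fairer basis for comparing the three algorithms before a final model is refitted on the full dataset.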