Study of the Applicability Domain of the QSAR Classification Models by Means of the Rivality and Modelability Indexes

The reliability of a QSAR classification model depends on its capacity to achieve confident predictions of new compounds not considered in the building of the model. The results of this external validation process show the applicability domain (AD) of the QSAR model and, therefore, the robustness of...


Bibliographic Details
Main Authors: Irene Luque Ruiz, Miguel Ángel Gómez-Nieto
Format: Article
Language: English
Published: MDPI AG 2018-10-01
Series: Molecules
Subjects: QSAR; classification model; applicability domain; rivality index; modelability index
Online Access: https://www.mdpi.com/1420-3049/23/11/2756
id doaj-4ae73a3aa1084f0eb77b288231c1e163
record_format Article
spelling doaj-4ae73a3aa1084f0eb77b288231c1e163
2020-11-24T22:03:18Z
eng
MDPI AG
Molecules
1420-3049
2018-10-01
volume 23, issue 11, article 2756
10.3390/molecules23112756
Irene Luque Ruiz, Department of Computing and Numerical Analysis, Campus Universitario de Rabanales, Albert Einstein Building, University of Córdoba, E-14071 Córdoba, Spain
Miguel Ángel Gómez-Nieto, Department of Computing and Numerical Analysis, Campus Universitario de Rabanales, Albert Einstein Building, University of Córdoba, E-14071 Córdoba, Spain
collection DOAJ
language English
format Article
sources DOAJ
author Irene Luque Ruiz
Miguel Ángel Gómez-Nieto
title Study of the Applicability Domain of the QSAR Classification Models by Means of the Rivality and Modelability Indexes
publisher MDPI AG
series Molecules
issn 1420-3049
publishDate 2018-10-01
description The reliability of a QSAR classification model depends on its capacity to make confident predictions for new compounds not used in building the model. The results of this external validation process show the applicability domain (AD) of the QSAR model and, therefore, the robustness of the model in predicting the property/activity of new molecules. In this paper we propose the use of the rivality and modelability indexes to study whether a dataset can be correctly modeled by a QSAR algorithm and to predict the reliability of the built model when predicting the property/activity of new molecules. Calculating these indexes has a very low computational cost and does not require building a model, which makes them good tools for analyzing datasets in the first stages of building QSAR classification models. In our study, we selected two benchmark datasets with a similar number of molecules but very different modelability, and we corroborated the predictive capacity of the rivality and modelability indexes against classification models built using Support Vector Machine and Random Forest algorithms with 5-fold cross-validation and leave-one-out techniques. The results show the excellent ability of both indexes to predict outliers and the applicability domain of the QSAR classification models. In all cases, these values accurately predicted the statistical parameters of the QSAR models generated by the algorithms.
topic QSAR
classification model
applicability domain
rivality index
modelability index
url https://www.mdpi.com/1420-3049/23/11/2756