Efficiency of different measures for defining the applicability domain of classification models

Abstract The goal of defining an applicability domain for a predictive classification model is to identify the region in chemical space where the model’s predictions are reliable. The boundary of the applicability domain is defined with the help of a measure that shall reflect the reliability of an...

Bibliographic Details
Main Authors:	Waldemar Klingspohn, Miriam Mathea, Antonius ter Laak, Nikolaus Heinrich, Knut Baumann
Format:	Article
Language:	English
Published:	BMC 2017-08-01
Series:	Journal of Cheminformatics
Subjects:	Applicability domain; Applicability domain measures; Reject option; Novelty detection; Confidence estimation; Class probability estimation
Online Access:	http://link.springer.com/article/10.1186/s13321-017-0230-2

id	doaj-5ee9fb92b3e141c986510c106c77ee44
record_format	Article
spelling	doaj-5ee9fb92b3e141c986510c106c77ee44 | 2020-11-24T21:59:47Z | eng | BMC | Journal of Cheminformatics | 1758-2946 | 2017-08-01 | 9 | 1 | 1 | 17 | 10.1186/s13321-017-0230-2 | Efficiency of different measures for defining the applicability domain of classification models | Waldemar Klingspohn [0]; Miriam Mathea [1]; Antonius ter Laak [2]; Nikolaus Heinrich [3]; Knut Baumann [4] | [0] Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig; [1] Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig; [2] Bayer Pharma Aktiengesellschaft, Computational Chemistry; [3] Bayer Pharma Aktiengesellschaft, Computational Chemistry; [4] Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig | Abstract The goal of defining an applicability domain for a predictive classification model is to identify the region in chemical space where the model’s predictions are reliable. The boundary of the applicability domain is defined with the help of a measure that shall reflect the reliability of an individual prediction. Here, the available measures are differentiated into those that flag unusual objects and which are independent of the original classifier and those that use information of the trained classifier. The former set of techniques is referred to as novelty detection while the latter is designated as confidence estimation. A review of the available confidence estimators shows that most of these measures estimate the probability of class membership of the predicted objects which is inversely related to the error probability. Thus, class probability estimates are natural candidates for defining the applicability domain but were not comprehensively included in previous benchmark studies. The focus of the present study is to find the best measure for defining the applicability domain for a given binary classification technique and to determine the performance of novelty detection versus confidence estimation. Six different binary classification techniques in combination with ten data sets were studied to benchmark the various measures. The area under the receiver operating characteristic curve (AUC ROC) was employed as main benchmark criterion. It is shown that class probability estimates constantly perform best to differentiate between reliable and unreliable predictions. Previously proposed alternatives to class probability estimates do not perform better than the latter and are inferior in most cases. Interestingly, the impact of defining an applicability domain depends on the observed area under the receiver operator characteristic curve. That means that it depends on the level of difficulty of the classification problem (expressed as AUC ROC) and will be largest for intermediately difficult problems (range AUC ROC 0.7–0.9). In the ranking of classifiers, classification random forests performed best on average. Hence, classification random forests in combination with the respective class probability estimate are a good starting point for predictive binary chemoinformatic classifiers with applicability domain. | http://link.springer.com/article/10.1186/s13321-017-0230-2 | Applicability domain; Applicability domain measures; Reject option; Novelty detection; Confidence estimation; Class probability estimation
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Waldemar Klingspohn; Miriam Mathea; Antonius ter Laak; Nikolaus Heinrich; Knut Baumann
spellingShingle	Waldemar Klingspohn Miriam Mathea Antonius ter Laak Nikolaus Heinrich Knut Baumann Efficiency of different measures for defining the applicability domain of classification models Journal of Cheminformatics Applicability domain Applicability domain measures Reject option Novelty detection Confidence estimation Class probability estimation
author_facet	Waldemar Klingspohn; Miriam Mathea; Antonius ter Laak; Nikolaus Heinrich; Knut Baumann
author_sort	Waldemar Klingspohn
title	Efficiency of different measures for defining the applicability domain of classification models
title_short	Efficiency of different measures for defining the applicability domain of classification models
title_full	Efficiency of different measures for defining the applicability domain of classification models
title_fullStr	Efficiency of different measures for defining the applicability domain of classification models
title_full_unstemmed	Efficiency of different measures for defining the applicability domain of classification models
title_sort	efficiency of different measures for defining the applicability domain of classification models
publisher	BMC
series	Journal of Cheminformatics
issn	1758-2946
publishDate	2017-08-01
description	Abstract The goal of defining an applicability domain for a predictive classification model is to identify the region in chemical space where the model’s predictions are reliable. The boundary of the applicability domain is defined with the help of a measure that shall reflect the reliability of an individual prediction. Here, the available measures are differentiated into those that flag unusual objects and which are independent of the original classifier and those that use information of the trained classifier. The former set of techniques is referred to as novelty detection while the latter is designated as confidence estimation. A review of the available confidence estimators shows that most of these measures estimate the probability of class membership of the predicted objects which is inversely related to the error probability. Thus, class probability estimates are natural candidates for defining the applicability domain but were not comprehensively included in previous benchmark studies. The focus of the present study is to find the best measure for defining the applicability domain for a given binary classification technique and to determine the performance of novelty detection versus confidence estimation. Six different binary classification techniques in combination with ten data sets were studied to benchmark the various measures. The area under the receiver operating characteristic curve (AUC ROC) was employed as main benchmark criterion. It is shown that class probability estimates constantly perform best to differentiate between reliable and unreliable predictions. Previously proposed alternatives to class probability estimates do not perform better than the latter and are inferior in most cases. Interestingly, the impact of defining an applicability domain depends on the observed area under the receiver operator characteristic curve. That means that it depends on the level of difficulty of the classification problem (expressed as AUC ROC) and will be largest for intermediately difficult problems (range AUC ROC 0.7–0.9). In the ranking of classifiers, classification random forests performed best on average. Hence, classification random forests in combination with the respective class probability estimate are a good starting point for predictive binary chemoinformatic classifiers with applicability domain.
topic	Applicability domain; Applicability domain measures; Reject option; Novelty detection; Confidence estimation; Class probability estimation
url	http://link.springer.com/article/10.1186/s13321-017-0230-2
work_keys_str_mv	AT waldemarklingspohn efficiencyofdifferentmeasuresfordefiningtheapplicabilitydomainofclassificationmodels AT miriammathea efficiencyofdifferentmeasuresfordefiningtheapplicabilitydomainofclassificationmodels AT antoniusterlaak efficiencyofdifferentmeasuresfordefiningtheapplicabilitydomainofclassificationmodels AT nikolausheinrich efficiencyofdifferentmeasuresfordefiningtheapplicabilitydomainofclassificationmodels AT knutbaumann efficiencyofdifferentmeasuresfordefiningtheapplicabilitydomainofclassificationmodels
_version_	1725847165387931648
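
Illustrative note: the abstract describes using a class probability estimate as a confidence measure and scoring, via AUC ROC, how well that measure separates reliable from unreliable predictions. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function names (confidence_from_probability, auc_roc, evaluate_ad_measure), the 0.5 decision cut-off, and the confidence threshold are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def confidence_from_probability(p_positive: float) -> float:
    """Confidence of a binary prediction: estimated probability of the predicted class."""
    return max(p_positive, 1.0 - p_positive)


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC ROC as the fraction of (label 1, label 0) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each label")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate_ad_measure(
    p_positive: Sequence[float], y_true: Sequence[int], threshold: float
) -> Tuple[float, float, float]:
    """Return (AUC ROC of the confidence measure, coverage, accuracy inside the domain).

    Assumptions (hypothetical, for illustration): class 1 is predicted when
    p_positive >= 0.5, and a prediction counts as inside the applicability
    domain when its confidence is at least `threshold`.
    """
    y_pred: List[int] = [1 if p >= 0.5 else 0 for p in p_positive]
    conf = [confidence_from_probability(p) for p in p_positive]
    # Label each prediction as correct (1) or wrong (0); a good measure
    # assigns higher confidence to the correct ones.
    correct = [int(yp == yt) for yp, yt in zip(y_pred, y_true)]
    ad_auc = auc_roc(conf, correct)
    inside = [c for c, k in zip(correct, conf) if k >= threshold]
    coverage = len(inside) / len(correct)
    acc_inside = sum(inside) / len(inside) if inside else float("nan")
    return ad_auc, coverage, acc_inside
```

Calling evaluate_ad_measure with the positive-class probabilities from any binary classifier (for instance the vote fraction of a random forest), the true labels, and a chosen threshold returns the benchmark-style AUC ROC of the measure together with the coverage and accuracy obtained when low-confidence predictions are rejected.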

Similar Items