Multi-objective ROC learning for classification

Receiver operating characteristic (ROC) curves are widely used for evaluating classifier performance, having been applied to e.g. signal detection, medical diagnostics and safety critical systems. They allow examination of the trade-offs between true and false positive rates as misclassification cos...

Bibliographic Details
Main Author:	Clark, Andrew Robert James
Other Authors:	Everson, Richard
Published:	University of Exeter 2011
Subjects:	519.6 Relevance Vector Machine : Multi-objective optimisation : ROC curves : Classification : Approximate Bayesian Computation : Cross-validation : Evolutionary algorithm : Multi-resolution kernels
Online Access:	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.550687

id	ndltd-bl.uk-oai-ethos.bl.uk-550687
record_format	oai_dc
spelling	ndltd-bl.uk-oai-ethos.bl.uk-550687; 2015-03-20T04:04:42Z; Multi-objective ROC learning for classification; Clark, Andrew Robert James; Everson, Richard; 2011; University of Exeter; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.550687; http://hdl.handle.net/10036/3530; Electronic Thesis or Dissertation
collection	NDLTD
sources	NDLTD
topic	519.6 Relevance Vector Machine : Multi-objective optimisation : ROC curves : Classification : Approximate Bayesian Computation : Cross-validation : Evolutionary algorithm : Multi-resolution kernels
description	Receiver operating characteristic (ROC) curves are widely used for evaluating classifier performance, having been applied to e.g. signal detection, medical diagnostics and safety-critical systems. They allow examination of the trade-offs between true and false positive rates as misclassification costs are varied. Examination of the resulting graphs and calculation of the area under the ROC curve (AUC) allows assessment of how well a classifier is able to separate two classes and allows selection of an operating point with full knowledge of the available trade-offs. In this thesis a multi-objective evolutionary algorithm (MOEA) is used to find classifiers whose ROC graph locations are Pareto optimal. The Relevance Vector Machine (RVM) is a state-of-the-art classifier that produces sparse Bayesian models, but is unfortunately prone to overfitting. Using the MOEA, hyper-parameters for RVM classifiers are set, optimising them not only in terms of true and false positive rates but also a novel measure of RVM complexity, thus encouraging sparseness, and producing approximations to the Pareto front. Several methods for regularising the RVM during the MOEA training process are examined and their performance evaluated on a number of benchmark datasets, demonstrating they possess the capability to avoid overfitting whilst producing performance equivalent to that of the maximum likelihood trained RVM. A common task in bioinformatics is to identify genes associated with various genetic conditions by finding those genes useful for classifying a condition against a baseline. Typically, datasets contain large numbers of gene expressions measured in relatively few subjects. As a result of the high dimensionality and sparsity of examples, it can be very easy to find classifiers with near perfect training accuracies but which have poor generalisation capability. Additionally, depending on the condition and treatment involved, evaluation over a range of costs will often be desirable. An MOEA is used to identify genes for classification by simultaneously maximising the area under the ROC curve whilst minimising model complexity. This method is illustrated on a number of well-studied datasets and applied to a recent bioinformatics database resulting from the current InChianti population study. Many classifiers produce “hard”, non-probabilistic classifications and are trained to find a single set of parameters, whose values are inevitably uncertain due to limited available training data. In a Bayesian framework it is possible to ameliorate the effects of this parameter uncertainty by averaging over classifiers weighted by their posterior probability. Unfortunately, the required posterior probability is not readily computed for hard classifiers. In this thesis an Approximate Bayesian Computation Markov Chain Monte Carlo algorithm is used to sample model parameters for a hard classifier using the AUC as a measure of performance. The ability to produce ROC curves close to the Bayes optimal ROC curve is demonstrated on a synthetic dataset. Due to the large numbers of sampled parametrisations, averaging over them when rapid classification is needed may be impractical and thus methods for producing sparse weightings are investigated.
author2	Everson, Richard
author_facet	Everson, Richard Clark, Andrew Robert James
author	Clark, Andrew Robert James
author_sort	Clark, Andrew Robert James
title	Multi-objective ROC learning for classification
title_sort	multi-objective roc learning for classification
publisher	University of Exeter
publishDate	2011
url	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.550687
work_keys_str_mv	AT clarkandrewrobertjames multiobjectiveroclearningforclassification
_version_	1716783771246657536
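
Illustrative note on the quantities named in the description: the sketch below is not taken from the thesis. It only shows, under simple assumptions, how ROC points can be obtained by sweeping a threshold over classifier scores, how the AUC follows from the trapezoidal rule, and how a set of (false positive rate, true positive rate) locations can be filtered down to its non-dominated (Pareto optimal) members. The function names roc_points, auc and pareto_front are hypothetical, and the plain dominance filter stands in for the selection step of an MOEA; no evolutionary search or RVM training is performed.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Sweep a decision threshold over the scores and return ROC points.

    Labels are 1 (positive) or 0 (negative); a higher score means
    "more likely positive". Names and conventions here are assumptions.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    # Sort by descending score so that each step lowers the threshold.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for k, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all examples tied at this score are counted.
        is_last = k + 1 == len(order)
        if is_last or scores[order[k + 1]] != scores[i]:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under a ROC curve, computed with the trapezoidal rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point dominates.

    A point dominates another if its false positive rate is no higher and
    its true positive rate is no lower, and the two points differ.
    """
    def dominates(a: Point, b: Point) -> bool:
        return a[0] <= b[0] and a[1] >= b[1] and a != b

    return sorted({p for p in points
                   if not any(dominates(q, p) for q in points)})
```

For example, roc_points([0.9, 0.8, 0.7, 0.6, 0.55, 0.4], [1, 1, 0, 1, 0, 0]) yields a curve whose auc is 8/9, which matches the fraction of positive/negative pairs that the scores rank correctly.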

Similar Items