Multi-Objective Evolutionary Rule-Based Classification with Categorical Data
The ease of interpretation of a classification model is essential for the task of validating it. Sometimes it is required to clearly explain the classification process of a model’s predictions. Models which are inherently easier to interpret can be effortlessly related to the context of th...
Main Authors: | Fernando Jiménez, Carlos Martínez, Luis Miralles-Pechuán, Gracia Sánchez, Guido Sciavicco |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2018-09-01 |
Series: | Entropy |
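Subjects: | multi-objective evolutionary algorithms; rule-based classifiers; interpretable machine learning; categorical data |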
Online Access: | http://www.mdpi.com/1099-4300/20/9/684 |
id |
doaj-95a3de4225944e51816c3322f732fd9d |
---|---|
record_format |
Article |
spelling |
doaj-95a3de4225944e51816c3322f732fd9d; 2020-11-24T22:03:02Z; eng; MDPI AG; Entropy; ISSN 1099-4300; 2018-09-01; vol. 20, no. 9, art. 684; DOI 10.3390/e20090684 (e20090684)
Multi-Objective Evolutionary Rule-Based Classification with Categorical Data
Authors: Fernando Jiménez (0), Carlos Martínez (1), Luis Miralles-Pechuán (2), Gracia Sánchez (3), Guido Sciavicco (4)
Affiliations: (0) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain; (1) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain; (2) Centre for Applied Data Analytics Research (CeADAR), University College Dublin, D04 Dublin 4, Ireland; (3) Department of Information and Communication Engineering, University of Murcia, 30071 Murcia, Spain; (4) Department of Mathematics and Computer Science, University of Ferrara, 44121 Ferrara, Italy
URL: http://www.mdpi.com/1099-4300/20/9/684
Keywords: multi-objective evolutionary algorithms; rule-based classifiers; interpretable machine learning; categorical data |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Fernando Jiménez Carlos Martínez Luis Miralles-Pechuán Gracia Sánchez Guido Sciavicco |
author_sort |
Fernando Jiménez |
title |
Multi-Objective Evolutionary Rule-Based Classification with Categorical Data |
publisher |
MDPI AG |
series |
Entropy |
issn |
1099-4300 |
publishDate |
2018-09-01 |
description |
The ease of interpretation of a classification model is essential for the task of validating it. It is sometimes necessary to explain clearly how a model arrives at its predictions. Models that are inherently easier to interpret can be readily related to the context of the problem, and their predictions can be, if necessary, ethically and legally evaluated. In this paper, we propose a novel method to generate rule-based classifiers from categorical data that can be readily interpreted. Classifiers are generated using a multi-objective optimization approach focusing on two main objectives: maximizing the performance of the learned classifier and minimizing its number of rules. The multi-objective evolutionary algorithms ENORA and NSGA-II have been adapted to optimize the performance of the classifier based on three different machine learning metrics: accuracy, area under the ROC curve, and root mean square error. We have extensively compared the classifiers generated using our proposed method with classifiers generated using classical methods such as PART, JRip, OneR, and ZeroR. The experiments have been conducted in full training mode, in 10-fold cross-validation mode, and in train/test splitting mode. To make results reproducible, we have used the well-known and publicly available datasets Breast Cancer, Monk’s Problem 2, Tic-Tac-Toe-Endgame, Car, kr-vs-kp, and Nursery. After performing an exhaustive statistical test on our results, we conclude that the proposed method is able to generate highly accurate and easy-to-interpret classification models. |
topic |
multi-objective evolutionary algorithms rule-based classifiers interpretable machine learning categorical data |
url |
http://www.mdpi.com/1099-4300/20/9/684 |
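The description above names two competing objectives: maximizing classifier performance and minimizing the number of rules. As a rough illustration only, the sketch below shows the Pareto-dominance relation that such a trade-off implies, using accuracy as the performance metric. The `Candidate` type, the function names, and all numeric values are hypothetical and are not taken from the article; the sketch does not implement ENORA or NSGA-II, which add their own ranking, selection, and variation steps on top of a dominance test.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical rule-based classifier summarized by its two objectives."""
    name: str
    accuracy: float  # objective 1: to be maximized
    n_rules: int     # objective 2: to be minimized


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.n_rules <= b.n_rules
    strictly_better = a.accuracy > b.accuracy or a.n_rules < b.n_rules
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values, chosen only to show the trade-off.
candidates = [
    Candidate("A", accuracy=0.95, n_rules=12),
    Candidate("B", accuracy=0.93, n_rules=5),
    Candidate("C", accuracy=0.90, n_rules=7),  # worse than B on both objectives
    Candidate("D", accuracy=0.80, n_rules=2),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy set, candidate C is dropped because B is both more accurate and smaller, while A, B, and D remain as alternative trade-offs between accuracy and rule count.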