evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R
Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as spli...
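The abstract's central claim, that forward stepwise splitting can settle on a locally optimal tree where a global search would not, can be illustrated with a small toy sketch. This is not the evtree package (which the abstract describes as R with a C++ core) and not its algorithm. The 16-row dataset, the `(root, left, right)` encoding of a depth-2 tree, and the mutation-only search in `evolve_tree` are all assumptions made up for this illustration.

```python
import random
from collections import Counter

# Toy data (an assumption for illustration): label y = x1 XOR x2, plus a
# distractor feature x3 that equals y in 3 of every 4 rows.
DATA = []
for x1 in (0, 1):
    for x2 in (0, 1):
        y = x1 ^ x2
        for k in range(4):
            x3 = y if k < 3 else 1 - y
            DATA.append(((x1, x2, x3), y))
N_FEATURES = 3


def leaf_errors(rows):
    """Misclassifications when a leaf predicts its majority class."""
    if not rows:
        return 0
    counts = Counter(y for _, y in rows)
    return len(rows) - max(counts.values())


def split(rows, f):
    """Partition rows on binary feature f (0 goes left, 1 goes right)."""
    left = [r for r in rows if r[0][f] == 0]
    right = [r for r in rows if r[0][f] == 1]
    return left, right


def tree_errors(tree, rows=DATA):
    """Errors of a depth-2 tree encoded as (root, left, right) feature indices."""
    root, lf, rf = tree
    left, right = split(rows, root)
    leaves = (*split(left, lf), *split(right, rf))
    return sum(leaf_errors(leaf) for leaf in leaves)


def best_single_split(rows):
    """Feature whose single split leaves the fewest errors (ties: lowest index)."""
    return min(range(N_FEATURES),
               key=lambda f: sum(leaf_errors(p) for p in split(rows, f)))


def greedy_tree(rows=DATA):
    """Forward stepwise search: fix the best root split, then the best child splits."""
    root = best_single_split(rows)
    left, right = split(rows, root)
    return (root, best_single_split(left), best_single_split(right))


def evolve_tree(iterations=2000, seed=0):
    """Mutation-only search over whole trees; keeps a candidate if it is no worse."""
    rng = random.Random(seed)
    tree = tuple(rng.randrange(N_FEATURES) for _ in range(3))
    for _ in range(iterations):
        candidate = list(tree)
        candidate[rng.randrange(3)] = rng.randrange(N_FEATURES)
        candidate = tuple(candidate)
        if tree_errors(candidate) <= tree_errors(tree):
            tree = candidate
    return tree


greedy = greedy_tree()
evolved = evolve_tree()
print("greedy :", greedy, "errors:", tree_errors(greedy))
print("evolved:", evolved, "errors:", tree_errors(evolved))
```

On this toy data the greedy search commits to the distractor `x3` at the root, because it gives the best one-step gain, and ends with 4 of 16 rows misclassified. The search that scores complete trees can recover the XOR structure and reach 0 errors. The data were constructed to produce this gap; the sketch says nothing about how often it occurs on real data.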
Main Authors: | Thomas Grubinger, Achim Zeileis, Karl-Peter Pfeiffer |
---|---|
Format: | Article |
Language: | English |
Published: | Foundation for Open Access Statistics, 2014-10-01 |
Series: | Journal of Statistical Software |
Online Access: | http://www.jstatsoft.org/index.php/jss/article/view/2189 |
id |
doaj-538d7ace92f548f2873f3b3bfb31bdca |
---|---|
spelling |
doaj-538d7ace92f548f2873f3b3bfb31bdca; 2020-11-24T23:24:27Z; eng; Foundation for Open Access Statistics; Journal of Statistical Software; 1548-7660; 2014-10-01; 61; 1; 1; 29; 10.18637/jss.v061.i01; 793; evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R; Thomas Grubinger; Achim Zeileis; Karl-Peter Pfeiffer; http://www.jstatsoft.org/index.php/jss/article/view/2189 |
collection |
DOAJ |
author |
Thomas Grubinger Achim Zeileis Karl-Peter Pfeiffer |
issn |
1548-7660 |
description |
Commonly used classification and regression tree methods like the CART algorithm are recursive partitioning methods that build the model in a forward stepwise search. Although this approach is known to be an efficient heuristic, the results of recursive tree methods are only locally optimal, as splits are chosen to maximize homogeneity at the next step only. An alternative way to search over the parameter space of trees is to use global optimization methods like evolutionary algorithms. This paper describes the evtree package, which implements an evolutionary algorithm for learning globally optimal classification and regression trees in R. Computationally intensive tasks are fully computed in C++ while the partykit package is leveraged for representing the resulting trees in R, providing unified infrastructure for summaries, visualizations, and predictions. evtree is compared to the open-source CART implementation rpart, conditional inference trees (ctree), and the open-source C4.5 implementation J48. A benchmark study of predictive accuracy and complexity is carried out in which evtree achieved at least similar and most of the time better results compared to rpart, ctree, and J48. Furthermore, the usefulness of evtree in practice is illustrated in a textbook customer classification task. |