adabag: An R Package for Classification with Boosting and Bagging
Boosting and bagging are two widely used ensemble methods for classification. Their common goal is to improve the accuracy of a classifier combining single classifiers which are slightly better than random guessing. Among the family of boosting algorithms, AdaBoost (adaptive boosting) is the best known, although it is suitable only for dichotomous tasks. AdaBoost.M1 and SAMME (stagewise additive modeling using a multi-class exponential loss function) are two easy and natural extensions to the general case of two or more classes. In this paper, the adabag R package is introduced. This version implements AdaBoost.M1, SAMME and bagging algorithms with classification trees as base classifiers. Once the ensembles have been trained, they can be used to predict the class of new samples. The accuracy of these classifiers can be estimated in a separated data set or through cross validation. Moreover, the evolution of the error as the ensemble grows can be analysed and the ensemble can be pruned. In addition, the margin in the class prediction and the probability of each class for the observations can be calculated. Finally, several classic examples in classification literature are shown to illustrate the use of this package.
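A minimal sketch of the train/predict/evaluate workflow the abstract describes, using the built-in iris data; the split and the argument values (`mfinal`, `coeflearn`) are illustrative choices, not taken from the paper:

```r
library(adabag)   # provides boosting(), bagging(), errorevol(), margins()
library(rpart)    # classification trees used as base classifiers

# Illustrative train/test split of the iris data
set.seed(1)
train <- sample(nrow(iris), 100)

# Train an AdaBoost.M1 ensemble of 50 classification trees
fit <- boosting(Species ~ ., data = iris[train, ],
                mfinal = 50, coeflearn = "Breiman")

# Predict the class of new samples and estimate accuracy on a separate set
pred <- predict(fit, newdata = iris[-train, ])
pred$confusion   # confusion matrix on the test set
pred$error       # test-set error rate

# Evolution of the error as the ensemble grows
evol <- errorevol(fit, newdata = iris[-train, ])

# Margins of the class predictions on the training observations
marg <- margins(fit, newdata = iris[train, ])
```

Cross-validated error estimates, also mentioned in the abstract, are available through `boosting.cv()` and `bagging.cv()` with an analogous interface.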
| Main Authors: | Esteban Alfaro, Matias Gamez, Noelia García |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Foundation for Open Access Statistics, 2013-09-01 |
| Series: | Journal of Statistical Software |
| Online Access: | http://www.jstatsoft.org/index.php/jss/article/view/2082 |
id |
doaj-7b5d72b4c97047cc824395f9a827b83f |
---|---|
record_format |
Article |
doi |
10.18637/jss.v054.i02 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Esteban Alfaro, Matias Gamez, Noelia García |
author_sort |
Esteban Alfaro |
title |
adabag: An R Package for Classification with Boosting and Bagging |
publisher |
Foundation for Open Access Statistics |
series |
Journal of Statistical Software |
issn |
1548-7660 |
publishDate |
2013-09-01 |
description |
Boosting and bagging are two widely used ensemble methods for classification. Their common goal is to improve the accuracy of a classifier combining single classifiers which are slightly better than random guessing. Among the family of boosting algorithms, AdaBoost (adaptive boosting) is the best known, although it is suitable only for dichotomous tasks. AdaBoost.M1 and SAMME (stagewise additive modeling using a multi-class exponential loss function) are two easy and natural extensions to the general case of two or more classes. In this paper, the adabag R package is introduced. This version implements AdaBoost.M1, SAMME and bagging algorithms with classification trees as base classifiers. Once the ensembles have been trained, they can be used to predict the class of new samples. The accuracy of these classifiers can be estimated in a separated data set or through cross validation. Moreover, the evolution of the error as the ensemble grows can be analysed and the ensemble can be pruned. In addition, the margin in the class prediction and the probability of each class for the observations can be calculated. Finally, several classic examples in classification literature are shown to illustrate the use of this package. |
url |
http://www.jstatsoft.org/index.php/jss/article/view/2082 |
_version_ |
1725820435800522752 |