Domain-Based Benchmark Experiments: Exploratory and Inferential Analysis

Benchmark experiments are the method of choice for comparing learning algorithms empirically. For a collection of data sets, the empirical performance distributions of a set of learning algorithms are estimated, compared, and ordered; usually this is done for each data set separately. The present manuscript extends this single-data-set approach to a joint analysis of the complete collection, the so-called problem domain. This makes it possible to decide which algorithms to deploy in a specific application, or to compare newly developed algorithms with well-known ones on established problem domains. Specialized visualization methods allow for easy exploration of large amounts of benchmark data. Furthermore, we take the benchmark experiment design into account and use mixed-effects models to provide a formal statistical analysis. Two domain-based benchmark experiments demonstrate our methods: the UCI domain, a well-known domain used when developing a new algorithm; and the Grasshopper domain, where the goal is to find the best learning algorithm for a prediction component in an enterprise application software system.
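The inferential analysis the abstract describes can be illustrated with a small sketch. Assuming benchmark results in long format (one performance value per data set, algorithm, and resampling replication), a mixed-effects model with a fixed algorithm effect and a random data-set effect, broadly in the spirit of the approach summarized above, could be fit as follows. All data-set names, algorithm names, and error rates below are invented for illustration; this is not the authors' code.

```python
# Minimal sketch of a domain-based inferential analysis (hypothetical data):
# pool per-replication misclassification rates from several data sets and
# fit a mixed-effects model -- fixed effect for the algorithm, random
# intercept for the data set -- using statsmodels.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
datasets = ["monks3", "sonar", "ionosphere", "credit"]   # hypothetical domain
algorithms = {"lda": 0.00, "rpart": 0.03, "svm": -0.02}  # assumed true offsets
B = 50  # resampling replications per (data set, algorithm) pair

rows = []
for ds in datasets:
    ds_effect = rng.normal(0.25, 0.05)  # data-set difficulty (random effect)
    for alg, offset in algorithms.items():
        for e in ds_effect + offset + rng.normal(0, 0.02, size=B):
            rows.append({"dataset": ds, "algorithm": alg, "misclass": e})
results = pd.DataFrame(rows)

# Fixed effect: algorithm; random intercept: data set.
fit = smf.mixedlm("misclass ~ algorithm", results,
                  groups=results["dataset"]).fit()
print(fit.summary())  # algorithm coefficients = domain-wide differences
```

The fixed-effect coefficients estimate domain-wide performance differences between algorithms, while the random intercept absorbs how difficult each data set is; a richer model could add algorithm-by-data-set interaction effects, which MixedLM can express through its variance-components (`vc_formula`) argument.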

Bibliographic Details
Main Authors: Manuel J. A. Eugster (Institut für Statistik, LMU München, Germany), Torsten Hothorn (Institut für Statistik, LMU München, Germany), Friedrich Leisch (Institut für Angewandte Statistik und EDV, BOKU Wien, Austria)
Format: Article
Language: English
Published: Austrian Statistical Society, 2016-02-01
Series: Austrian Journal of Statistics, Vol. 41, No. 1
ISSN: 1026-597X
DOI: 10.17713/ajs.v41i1.185
Online Access: http://www.ajs.or.at/index.php/ajs/article/view/185