The design and analysis of benchmark experiments

The assessment of the performance of learners by means of benchmark experiments is an established exercise. In practice, benchmark studies are a tool to compare the performance of several competing algorithms for a certain learning problem. Cross-validation or resampling techniques are commonly used to...


Bibliographic Details
Main Authors:	Hothorn, Torsten, Leisch, Friedrich, Zeileis, Achim, Hornik, Kurt
Format:	Others
Language:	en
Published:	SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business 2003
Subjects:	Model Comparison / Performance / Hypothesis Testing / Cross-Validation / Bootstrap
Online Access:	http://epub.wu.ac.at/758/1/document.pdf

description	The assessment of the performance of learners by means of benchmark experiments is an established exercise. In practice, benchmark studies are a tool to compare the performance of several competing algorithms for a certain learning problem. Cross-validation or resampling techniques are commonly used to derive point estimates of the performances, which are compared to identify algorithms with good properties. For several benchmarking problems, test procedures taking the variability of those point estimates into account have been suggested. Most of the recently proposed inference procedures are based on special variance estimators for the cross-validated performance. We introduce a theoretical framework for inference problems in benchmark experiments and show that standard statistical test procedures can be used to test for differences in the performances. The theory is based on well-defined distributions of performance measures, which can be compared with established tests. To demonstrate the usefulness in practice, the theoretical results are applied to benchmark studies in a supervised learning situation based on artificial and real-world data.
series	Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"


Similar Items