A study on the selection error rate of classification algorithms evaluated by k-fold cross validation.

Master's === National Cheng Kung University === Institute of Information Management === 102 === The performance of classification algorithms is generally evaluated by K-fold cross validation to find the one with the highest accuracy. The model induced from all available data by the best classification algorithm, called the full-sample model, is then used...

Bibliographic Details
Main Authors:	Chiao-Ying Lin, 林巧盈
Other Authors:	Tzu-Tsung Wong
Format:	Others
Language:	zh-TW
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/23699989925707105417

id	ndltd-TW-102NCKU5396012
record_format	oai_dc
spelling	ndltd-TW-102NCKU5396012 2016-03-07T04:10:57Z http://ndltd.ncl.edu.tw/handle/23699989925707105417 A study on the selection error rate of classification algorithms evaluated by k-fold cross validation. 探討K等分交叉驗證法對於分類器錯選率之研究 Chiao-Ying Lin 林巧盈 Master's, National Cheng Kung University, Institute of Information Management, 102 Tzu-Tsung Wong 翁慈宗 2014 學位論文 ; thesis 55 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Cheng Kung University === Institute of Information Management === 102 === The performance of classification algorithms is generally evaluated by K-fold cross validation to find the one with the highest accuracy. The model induced from all available data by the best classification algorithm, called the full-sample model, is then used for prediction and interpretation. Because no extra data remain to evaluate the full-sample model produced by the best algorithm, its prediction accuracy can be lower than that of the full-sample model induced by another classification algorithm; this is called a selection error. This study designs an experiment to calculate and estimate the selection error rate, and attempts to propose a new model for reducing it. The classification algorithms considered are decision tree, naïve Bayesian classifier, logistic regression, and support vector machine. Experimental results on 30 data sets show that the actual and estimated selection error rates can differ greatly in several cases. The new model that has the median accuracy can reduce the selection error rate without sacrificing prediction accuracy.
author2	Tzu-Tsung Wong
author_facet	Tzu-Tsung Wong Chiao-Ying Lin 林巧盈
author	Chiao-Ying Lin 林巧盈
author_sort	Chiao-Ying Lin
title	A study on the selection error rate of classification algorithms evaluated by k-fold cross validation.
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/23699989925707105417
_version_	1718199555436052480
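
The selection error described in the abstract can be illustrated with a minimal sketch. This is not the thesis's experimental design: the two toy learners (`train_majority`, `train_1nn`) stand in for the four algorithms the study names, and the fresh labeled sample (`X_new`, `y_new`) is an assumption made so that the full-sample models can be scored at all, which is exactly the data one lacks in practice. All function names here are hypothetical.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def accuracy(model, X, y):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def train_majority(X, y):
    """Toy learner (assumption): always predict the most frequent label."""
    label = max(sorted(set(y)), key=y.count)
    return lambda x: label

def train_1nn(X, y):
    """Toy learner (assumption): 1-nearest neighbour on one numeric feature."""
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def cv_accuracy(train, X, y, folds):
    """Mean accuracy over the folds, each fold held out exactly once."""
    scores = []
    for fold in folds:
        held = set(fold)
        X_tr = [X[i] for i in range(len(X)) if i not in held]
        y_tr = [y[i] for i in range(len(y)) if i not in held]
        model = train(X_tr, y_tr)
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / len(scores)

def selection_error(learners, X, y, X_new, y_new, k=10, seed=0):
    """Pick the learner with the best K-fold accuracy, then check whether
    another learner's full-sample model is more accurate on fresh data."""
    folds = k_fold_indices(len(X), k, random.Random(seed))
    cv = {name: cv_accuracy(fn, X, y, folds) for name, fn in learners.items()}
    chosen = max(cv, key=cv.get)
    # Full-sample models: each learner trained on all available data.
    full = {name: accuracy(fn(X, y), X_new, y_new) for name, fn in learners.items()}
    is_error = any(full[name] > full[chosen] for name in learners)
    return chosen, cv, full, is_error
```

Repeating `selection_error` over many resampled data sets and averaging `is_error` would give one empirical estimate of a selection error rate under these assumptions.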

Similar Items