A study on the selection error rate of classification algorithms evaluated by k-fold cross validation.
Master's thesis === National Cheng Kung University (國立成功大學) === Institute of Information Management (資訊管理研究所) === 102 === Classification algorithms are generally evaluated by K-fold cross validation to find the one with the highest accuracy. The model induced from all available data by the best classification algorithm, called the full sample model, is then used...
Main Authors: | Chiao-Ying Lin, 林巧盈 |
---|---|
Other Authors: | Tzu-Tsung Wong |
Format: | Others |
Language: | zh-TW |
Published: | 2014 |
Online Access: | http://ndltd.ncl.edu.tw/handle/23699989925707105417 |
id |
ndltd-TW-102NCKU5396012 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102NCKU5396012 2016-03-07T04:10:57Z http://ndltd.ncl.edu.tw/handle/23699989925707105417 A study on the selection error rate of classification algorithms evaluated by k-fold cross validation. 探討K等分交叉驗證法對於分類器錯選率之研究 Chiao-Ying Lin 林巧盈 Master's thesis National Cheng Kung University (國立成功大學) Institute of Information Management (資訊管理研究所) 102 Tzu-Tsung Wong 翁慈宗 2014 thesis 55 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Cheng Kung University (國立成功大學) === Institute of Information Management (資訊管理研究所) === 102 === Classification algorithms are generally evaluated by K-fold cross validation to find the one with the highest accuracy. The model induced from all available data by the best classification algorithm, called the full sample model, is then used for prediction and interpretation. Because no extra data remain to evaluate the full sample model of the best algorithm, its prediction accuracy can be lower than that of the full sample model induced by another classification algorithm; this is called a selection error. This study designs an experiment to calculate and estimate the selection error rate, and attempts to propose a new model for reducing it. The classification algorithms considered are the decision tree, the naïve Bayesian classifier, logistic regression, and the support vector machine. The experimental results on 30 data sets show that the actual and estimated selection error rates can differ greatly in several cases. The new model that has the median accuracy can reduce the selection error rate without sacrificing prediction accuracy.
|
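The notion of a selection error described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the thesis's experimental design: the data are synthetic, two toy learners (nearest centroid and 1-nearest-neighbour) stand in for the four algorithms named above, a large separate sample approximates the true accuracy of each full sample model, and all function and parameter names are hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D two-class data (an assumption, not the thesis's data sets)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y else -1.0, 1.5)
        data.append((x, y))
    return data

# Two toy learners standing in for the algorithms compared in the thesis.
def fit_centroid(train):
    """Nearest centroid: predict the class whose training mean is closer."""
    means = {}
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        means[c] = sum(xs) / len(xs) if xs else 0.0
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def fit_1nn(train):
    """1-nearest-neighbour: predict the label of the closest training point."""
    return lambda x: min(train, key=lambda p: abs(x - p[0]))[1]

LEARNERS = {"centroid": fit_centroid, "1nn": fit_1nn}

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def kfold_accuracy(fit, data, k):
    """Mean accuracy over k folds; fold i holds out every k-th instance."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [p for j, p in enumerate(data) if j % k != i]
        scores.append(accuracy(fit(train), test))
    return sum(scores) / k

def selection_error_rate(trials=100, n=40, k=5, n_eval=500, seed=0):
    """Fraction of trials in which the algorithm chosen by k-fold cross
    validation yields a full sample model that is less accurate than the
    full sample model of another algorithm."""
    rng = random.Random(seed)
    evaluation = make_data(n_eval, rng)  # stand-in for the unknown true accuracy
    errors = 0
    for _ in range(trials):
        data = make_data(n, rng)
        cv = {name: kfold_accuracy(fit, data, k) for name, fit in LEARNERS.items()}
        chosen = max(cv, key=cv.get)  # algorithm with the highest CV accuracy
        # Full sample models: each algorithm trained on all available data.
        full = {name: accuracy(fit(data), evaluation) for name, fit in LEARNERS.items()}
        if full[chosen] < max(full.values()):
            errors += 1
    return errors / trials

if __name__ == "__main__":
    print("simulated selection error rate:", selection_error_rate())
```

In real applications no such evaluation sample exists, which is why the abstract distinguishes the actual selection error rate from an estimated one; the sketch only computes the former on simulated data.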
author2 |
Tzu-Tsung Wong |
author_facet |
Tzu-Tsung Wong Chiao-Ying Lin 林巧盈 |
author |
Chiao-Ying Lin 林巧盈 |
spellingShingle |
Chiao-Ying Lin 林巧盈 A study on the selection error rate of classification algorithms evaluated by k-fold cross validation. |
author_sort |
Chiao-Ying Lin |
title |
A study on the selection error rate of classification algorithms evaluated by k-fold cross validation. |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/23699989925707105417 |