Data Reduction in Bankruptcy Prediction

Master's === National Chung Cheng University === Graduate Institute of Accounting and Information Technology === 97 === Prior research on using data mining techniques for bankruptcy prediction focuses mainly on constructing effective prediction models. In particular, many studies develop hybrid models based on feature selection in the pre-processing stage. However, very few of th...


Bibliographic Details
Main Authors: Kai-Chun Cheng, 鄭凱駿
Other Authors: Chih-fong Tsai
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/45639385298165269954
id ndltd-TW-097CCU05736037
record_format oai_dc
spelling ndltd-TW-097CCU05736037 2016-05-04T04:25:48Z http://ndltd.ncl.edu.tw/handle/45639385298165269954 Data Reduction in Bankruptcy Prediction 應用資料精減於破產預測之研究 Kai-Chun Cheng 鄭凱駿 Master's National Chung Cheng University Graduate Institute of Accounting and Information Technology 97 Chih-fong Tsai 蔡志豐 2009 thesis 71 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chung Cheng University === Graduate Institute of Accounting and Information Technology === 97 === Prior research on using data mining techniques for bankruptcy prediction focuses mainly on constructing effective prediction models. In particular, many studies develop hybrid models based on feature selection in the pre-processing stage. However, very few of them emphasize data reduction. Data reduction can make the training dataset cleaner by removing outliers, which can improve prediction accuracy. Therefore, the purpose of this thesis is to develop a data reduction method that uses K-means to find the center of each cluster and calculates the distance from every data point in a cluster to that cluster's center. The points farthest from their cluster centers are then removed as outliers, at several different percentages. We use four datasets commonly used in the bankruptcy prediction domain and employ neural networks, decision trees, logistic regression, and support vector machines as the prediction models after data reduction. The experimental results show that models trained by the four classifiers on the four datasets after data reduction are, in general, more accurate than models trained without data reduction. Moreover, accuracy increases as the reduction percentage increases.
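The reduction step described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation. The function names `kmeans` and `reduce_by_distance`, the farthest-first initialization (chosen here only to make the run deterministic), the choice to drop points per cluster rather than globally, and the toy data are all assumptions. The abstract does not specify these details, nor whether clustering is applied to each class separately.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means. Returns (centers, labels).

    Assumption: centers are seeded with a farthest-first traversal so the
    sketch is deterministic; the thesis does not state its initialization.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centers, labels


def reduce_by_distance(points, k, drop_fraction):
    """Return sorted indices of the points that are kept.

    Assumption: within each cluster, the `drop_fraction` of members lying
    farthest from the cluster center are treated as outliers and dropped.
    """
    centers, labels = kmeans(points, k)
    keep = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        # Order members from nearest to farthest from their center.
        idx.sort(key=lambda i: math.dist(points[i], centers[c]))
        n_drop = int(len(idx) * drop_fraction)
        keep.extend(idx[: len(idx) - n_drop])
    return sorted(keep)


# Toy data (hypothetical): two tight groups, each with one distant point.
data = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0),
    (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1), (13.0, 13.0),
]
kept = reduce_by_distance(data, k=2, drop_fraction=0.2)
print(kept)  # [0, 1, 2, 3, 5, 6, 7, 8]
```

In this toy run, the two distant points (indices 4 and 9) are the ones removed. A classifier would then be trained only on the rows listed in `kept`, and the experiment repeated for each reduction percentage of interest.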
author2 Chih-fong Tsai
author_facet Chih-fong Tsai
Kai-Chun Cheng
鄭凱駿
author Kai-Chun Cheng
鄭凱駿
spellingShingle Kai-Chun Cheng
鄭凱駿
Data Reduction in Bankruptcy Prediction
author_sort Kai-Chun Cheng
title Data Reduction in Bankruptcy Prediction
title_short Data Reduction in Bankruptcy Prediction
title_full Data Reduction in Bankruptcy Prediction
title_fullStr Data Reduction in Bankruptcy Prediction
title_full_unstemmed Data Reduction in Bankruptcy Prediction
title_sort data reduction in bankruptcy prediction
publishDate 2009
url http://ndltd.ncl.edu.tw/handle/45639385298165269954
work_keys_str_mv AT kaichuncheng datareductioninbankruptcyprediction
AT zhèngkǎijùn datareductioninbankruptcyprediction
AT kaichuncheng yīngyòngzīliàojīngjiǎnyúpòchǎnyùcèzhīyánjiū
AT zhèngkǎijùn yīngyòngzīliàojīngjiǎnyúpòchǎnyùcèzhīyánjiū
_version_ 1718258238244257792