Improve the Classification Performance for Decision Tree by Population-based Approaches with Ensemble

Bibliographic Details
Main Authors: Wei-Lan Tasi, 蔡維倫
Other Authors: Shih-Wei Lin
Format: Others
Language: zh-TW
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/90102780678990829408
Description
Summary: Master's thesis === Huafan University === Master's Program, Department of Information Management === 97 === Data mining techniques have been widely used in prediction and classification problems. The decision tree (DT) algorithm, which provides a rule-based tree structure, is one of the most popular of these techniques and can be applied in many areas. However, different problems may require different parameters when DT is used to build a model, and the parameter settings influence the classification result. In addition, a dataset may contain many features, not all of which benefit the model. If feature selection is not performed, cost may increase and the learning ability of DT may be reduced. Therefore, scatter search (SS), genetic algorithm (GA), and particle swarm optimization (PSO) are proposed to select a beneficial subset of features and to obtain better parameter settings, which together result in better classification. Each of these three meta-heuristic algorithms has its own strengths and weaknesses. If the algorithms work together, better results can be expected; this is the so-called ensemble. This thesis proposes an ensemble to further improve the prediction or classification accuracy rate. Datasets from the UCI (University of California, Irvine) repository are used to evaluate the performance of the proposed approaches. The proposed DT algorithms based on the three meta-heuristic methods can find the best parameters and feature subset when facing various problems, and they provide a higher classification accuracy rate.
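
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes that a candidate solution jointly encodes a feature mask and DT parameters, that a simple genetic algorithm searches this encoding, and that the ensemble combines models by majority vote. The parameter names and ranges (`max_depth`, `min_samples_leaf`), the GA operators, and the functions `ga_search` and `majority_vote` are all hypothetical. The `fitness` callable is a placeholder for whatever score is used, for example the cross-validated accuracy of a decision tree.

```python
import random
from collections import Counter

# Hypothetical parameter ranges for illustration only; the abstract does not
# say which DT parameters were tuned or over what ranges.
PARAM_RANGES = {"max_depth": (1, 20), "min_samples_leaf": (1, 50)}


def random_solution(n_features, rng):
    """Build a random candidate: (feature mask, parameter dict)."""
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    params = {k: rng.randint(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}
    return mask, params


def crossover(a, b, rng):
    """Uniform crossover over both the mask bits and the parameters."""
    mask = [x if rng.random() < 0.5 else y for x, y in zip(a[0], b[0])]
    params = {k: (a[1][k] if rng.random() < 0.5 else b[1][k]) for k in a[1]}
    return mask, params


def mutate(sol, rng, rate=0.1):
    """Flip each mask bit and resample each parameter with probability `rate`."""
    mask = [(not bit) if rng.random() < rate else bit for bit in sol[0]]
    params = dict(sol[1])
    for k, (lo, hi) in PARAM_RANGES.items():
        if rng.random() < rate:
            params[k] = rng.randint(lo, hi)
    return mask, params


def ga_search(fitness, n_features, pop_size=20, generations=30, seed=0):
    """Return the best (mask, params) found; `fitness` maps a solution to a score."""
    rng = random.Random(seed)
    pop = [random_solution(n_features, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]  # the better half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)


def majority_vote(predictions):
    """Combine per-model label lists by majority vote for each sample.

    Ties go to the label seen first among the models for that sample.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]
```

In a real experiment, `fitness` would train a decision tree on the features selected by the mask with the candidate parameters and return its validation accuracy. SS and PSO could search the same encoding with their own update rules, and the three resulting models could then be passed to `majority_vote`.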