Summary: | Master's === Huafan University === Master's Program, Department of Information Management === 96 === Traditionally, decision trees do not perform as well as newer classifiers such as BPN or SVM in terms of accuracy. However, the human-readable results produced by decision trees make them well suited to expert domains such as medicine and business. Yet when a dataset contains severe data conflicts or a large amount of hidden information, the resulting tree becomes large and deep, which makes it difficult for humans to understand. Although decision trees deliver average performance in typical applications, the inefficiency of the binary cut used to split continuous data remains an open research topic; several methods claim to improve continuous-attribute splitting, but their improvements lack generalization. In this research, we propose a meta-heuristic method (simulated annealing) with a properly designed objective function, and we incorporate two pruning methods (In-Build Pruning and Post-Pruning) into the tree-generation process. We chose 10 UCI datasets as our original data, and all experiments followed the 10-fold cross-validation procedure. Compared with C4.5, our proposed method generates much simpler decision trees, with a relatively small tree size and roughly half the tree depth, while accuracy shows no significant difference.
|
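As a rough illustration of the kind of search described above, the following Python sketch uses simulated annealing to choose a split threshold for a single continuous attribute. The objective used here (plain information gain) and the cooling schedule are illustrative assumptions, not the thesis's actual objective function or pruning scheme.

```python
# Hypothetical sketch: simulated annealing for a continuous-attribute split.
# The information-gain objective is an assumed stand-in for the thesis's
# designed objective function.
import math
import random


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def objective(values, labels, threshold):
    """Information gain of splitting at `threshold` (assumed objective)."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def anneal_split(values, labels, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Search for a split threshold by simulated annealing."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    current = rng.uniform(lo, hi)
    best = current
    cur_score = best_score = objective(values, labels, current)
    temp = t0
    for _ in range(steps):
        # Propose a neighbouring threshold by a small random perturbation.
        candidate = min(hi, max(lo, current + rng.gauss(0, (hi - lo) * 0.1)))
        cand_score = objective(values, labels, candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_score > cur_score or rng.random() < math.exp((cand_score - cur_score) / temp):
            current, cur_score = candidate, cand_score
            if cur_score > best_score:
                best, best_score = current, cur_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score


if __name__ == "__main__":
    # Toy data: one continuous attribute with a class boundary near 5.0.
    xs = [1.2, 2.4, 3.1, 4.0, 4.8, 5.3, 6.1, 7.5, 8.2, 9.0]
    ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(anneal_split(xs, ys))
```

In a full tree builder, such a search would replace C4.5's exhaustive evaluation of candidate cut points at each node, with the pruning steps applied during and after tree construction as the abstract describes.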