Summary: | Master's === National Taichung Institute of Technology === Master's Program, Department of Distribution Management === 98 === With advances in technology and the spread of the Internet, the volume of information worldwide has grown geometrically. In response, large database management systems have been deployed across many application fields, and extracting important knowledge from them depends on efficient data mining techniques. Among these techniques, the decision tree is an important tool because it can identify existing causal relationships. A traditional decision tree classifies data by testing a single (univariate) attribute at each node, and the resulting classification model is usually large. Because it ignores the correlation between feature attributes, similar classification rules may be applied repeatedly, which lowers the efficiency of inductive learning. To improve classification efficiency, this study proposes a strategy that adapts principal component analysis (PCA) to simplify the classification. Using the communalities and explained variance obtained from PCA, we select an appropriate set of feature attributes and combine them into a multivariate classifier. This multivariate hybrid attribute is then used as the root of the constructed decision tree. Finally, the UCI database is used to evaluate the method, and the proposed method (multivariate hybrid attributes) is compared with traditional C4.5 (univariate attributes).
|
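The idea in the abstract, choosing correlated attributes by their PCA communality and testing their combination at the root, can be sketched roughly as follows. This is a minimal illustration under several assumptions and is not the thesis's implementation: the function names, the 0.5 communality cutoff, and the toy data are all hypothetical; only the first principal component is used, obtained by power iteration; and splits are scored by plain information gain rather than the gain ratio that C4.5 uses.

```python
import math

def standardize(rows):
    """Column-wise z-scores (population standard deviation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def first_component(z, iters=200):
    """Leading eigenvector/eigenvalue of the correlation matrix (power iteration)."""
    n, d = len(z), len(z[0])
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(d)] for a in range(d)]
    v = [float(j + 1) for j in range(d)]  # arbitrary non-uniform start vector
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[a] * sum(corr[a][b] * v[b] for b in range(d)) for a in range(d))
    return v, eigval

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def best_split(values, labels):
    """Best binary threshold on one numeric attribute, scored by information gain."""
    base, n = entropy(labels), len(labels)
    best_gain, best_t = 0.0, None
    cuts = sorted(set(values))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_gain, best_t

def root_split_comparison(rows, labels, min_communality=0.5):
    """Compare a PCA-based hybrid root attribute with the best single attribute."""
    z = standardize(rows)
    v, eigval = first_component(z)
    # With one retained component, communality = squared loading = eigval * v_j^2.
    communality = [eigval * w * w for w in v]
    selected = [j for j, c in enumerate(communality) if c >= min_communality]
    # Hybrid attribute: projection onto the component, restricted to selected columns.
    hybrid = [sum(v[j] * r[j] for j in selected) for r in z]
    hybrid_gain, _ = best_split(hybrid, labels)
    univariate_gain = max(best_split([r[j] for r in rows], labels)[0]
                          for j in range(len(rows[0])))
    return selected, hybrid_gain, univariate_gain

# Made-up toy data: columns 0 and 1 are correlated, column 2 is unrelated noise.
ROWS = [(1, 2, 5), (2, 1, 3), (2, 3, 4), (3, 2, 6),
        (3, 4, 5), (4, 3, 3), (4, 5, 4), (5, 4, 6)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]

selected, hybrid_gain, univariate_gain = root_split_comparison(ROWS, LABELS)
print("selected attributes:", selected)
print("root gain, hybrid attribute:", round(hybrid_gain, 3))
print("root gain, best single attribute:", round(univariate_gain, 3))
```

On this toy data the class depends on the two correlated columns jointly, so a single test on their combination separates the classes better at the root than any test on one column. A fuller treatment would retain several components, build the rest of the tree recursively, and evaluate on held-out UCI data.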