Apply Genetic Algorithms to Discretization

Bibliographic Details
Main Authors: Hsien-Lian Chiu, 邱獻良
Other Authors: Jiah-Shing Chen
Format: Others
Language: en_US
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/92907504347874525955
Description
Summary: Master's === National Central University === Graduate Institute of Information Management === 93 === Discretization of continuous attributes is one of the main problems to be solved in data mining. Discretization can be viewed as the problem of selecting a set of cut points for each attribute. Past studies concentrated on finding a minimal set of cut points while maintaining the fidelity of the original data. However, maintaining too high a level of consistency may yield many unnecessary rules that are not general. Generality is an important aspect of discretization because general rules are usually useful and easy to interpret. In this paper, a genetic algorithm (GA) based approach is proposed; the aim is to efficiently find an optimal compromise between the generality and consistency criteria. Two sets of experiments were run with this approach on data sets from the UCI Machine Learning Repository. The empirical results demonstrate that our GA approach can generate the simplest discretization that meets the decision maker's requirements and helps the classifier induce general rules with high predictive accuracy.
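
Illustrative sketch: the record does not give the thesis's chromosome encoding, fitness function, or GA operators, so the Python below is only a minimal sketch of the general idea, not the author's method. It assumes a single continuous attribute, a bit-mask chromosome over candidate cut points (midpoints between adjacent distinct values), and a weighted-sum fitness in which a hypothetical `weight` parameter stands in for the decision maker's preference between consistency and generality. All function names are invented for illustration.

```python
import random
from bisect import bisect_right


def candidate_cuts(values):
    """Midpoints between consecutive distinct values (assumed candidate set)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def consistency(values, labels, cuts):
    """Fraction of samples that carry the majority label of their interval."""
    counts = {}
    for v, y in zip(values, labels):
        interval = bisect_right(cuts, v)  # index of the interval holding v
        per_label = counts.setdefault(interval, {})
        per_label[y] = per_label.get(y, 0) + 1
    majority = sum(max(per_label.values()) for per_label in counts.values())
    return majority / len(values)


def fitness(mask, values, labels, cands, weight):
    """Assumed weighted sum: weight * consistency + (1 - weight) * generality."""
    cuts = [c for c, keep in zip(cands, mask) if keep]
    generality = 1 - len(cuts) / len(cands)  # fewer cut points = more general
    return weight * consistency(values, labels, cuts) + (1 - weight) * generality


def evolve(values, labels, weight=0.8, pop_size=30, generations=60, seed=0):
    """Tiny elitist GA; needs at least two candidate cut points."""
    rng = random.Random(seed)
    cands = candidate_cuts(values)

    def score(mask):
        return fitness(mask, values, labels, cands, weight)

    pop = [[rng.random() < 0.5 for _ in cands] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            point = rng.randrange(1, len(cands))  # one-point crossover
            child = a[:point] + b[point:]
            if rng.random() < 0.2:  # occasional single-bit mutation
                i = rng.randrange(len(cands))
                child[i] = not child[i]
            children.append(child)
        pop = elite + children
    best = max(pop, key=score)
    return [c for c, keep in zip(cands, best) if keep]


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 10, 11, 12, 13]
    ys = [0, 0, 0, 0, 1, 1, 1, 1]
    print(evolve(xs, ys))
```

Under these assumptions, raising `weight` toward 1 favors consistency (more cut points, higher fidelity to the data), while lowering it favors generality (fewer cut points, simpler rules).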