Summary: | 碩士 === 國立中央大學 === 工業管理研究所 === 103 === Due to the progressing of the science and technology, the data is growing rapidly. The speed of classifier has become an important part of data mining. Naïve Bayes classifier model is a simple and practical method of classification, it is based on applying Bayes’ theorem with strong independence assumptions between the features. But this assumption is not very realistic as in many real situations.
We propose a classifier method, PC-Naïve, which is based on Naïve Bayes classifier. We keep the simple and fast advantages of the Naïve Bays classifier and relax vital assumption for independence of the Naïve Bayes classifie model. We use Principal components analysis to transform the original data, make the attributes mutual linearly independence. Then discretization the transform data and calculate the prior and conditional probability. Final we can get the posterior probability and classifier the data.
We have used the examples to present the classifier procedures in our research and compare the accuracy with four models, including PC-Naïve model, tradition Naïve Bayes model, Decision Tree model and Stepwise Logistic Regression model. At the end, we have discuss the accuracy of different dimension and discretization methods.
|