Consistency Analysis of Hybrid Discretization Method among Classification Algorithms
Master's === National Cheng Kung University === Institute of Information Management === 103 === Discretization is one of the major approaches to processing continuous attributes for classification. However, the accuracy obtained on a data set can differ greatly depending on which discretization method is applied. A hybrid discretization method was propo...
Main Authors: | Bo-Han Huang (黃柏翰) |
---|---|
Other Authors: | Tzu-Tsung Wong (翁慈宗) |
Format: | Others |
Language: | zh-TW |
Published: | 2015 |
Online Access: | http://ndltd.ncl.edu.tw/handle/41008670954809817719 |
id |
ndltd-TW-103NCKU5396011 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103NCKU5396011 2016-05-22T04:40:56Z http://ndltd.ncl.edu.tw/handle/41008670954809817719 Consistency Analysis of Hybrid Discretization Method among Classification Algorithms 不同分類器的混合型離散化方法之一致性分析 Bo-Han Huang 黃柏翰 Master's National Cheng Kung University Institute of Information Management 103 Tzu-Tsung Wong 翁慈宗 2015 thesis 50 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Cheng Kung University === Institute of Information Management === 103 === Discretization is one of the major approaches to processing continuous attributes for classification. However, the accuracy obtained on a data set can differ greatly depending on which discretization method is applied. A hybrid discretization method was proposed recently, and it generally achieves better performance for the naïve Bayesian classifier than unified discretization. An earlier study developed a hybrid discretization method applicable to classifiers, which determines the discretization method for each attribute during the data preprocessing step. However, the results of that study showed that it cannot improve the performance of decision trees. Therefore, the objective of this study is to investigate the consistency of hybrid discretization results among classification algorithms. This study proposes two approaches to consistency analysis. The first identifies whether the best hybrid discretization results for one classification algorithm can improve the performance of the others. The second is a new measure that evaluates the consistency of the best hybrid discretization results of two classification algorithms. The classification tools used to test these methods are decision trees, naïve Bayesian classifiers, and rule-based classifiers. The experimental results on 30 data sets show that the best hybrid discretization results for one algorithm seldom improve the performance of the others. Moreover, most values of the consistency measure are low. These results suggest that the characteristics of a classification algorithm should be considered when designing a hybrid discretization method for data preprocessing.
|
author2 |
Tzu-Tsung Wong |
author |
Bo-Han Huang 黃柏翰 |
title |
Consistency Analysis of Hybrid Discretization Method among Classification Algorithms |
publishDate |
2015 |
url |
http://ndltd.ncl.edu.tw/handle/41008670954809817719 |