Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms
Machine learning approaches have been successfully applied to many classification and prediction problems. One of the most popular machine learning approaches is decision trees. A main advantage of decision trees is the clarity of the decision model they produce. The ID3 algorithm proposed by Quinla...
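Main Author: | Abu-halaweh, Nael Mohammed |
---|---|
Format: | Others |
Published: | Digital Archive @ GSU 2009 |
Subjects: | Decision tree ID3 Fuzzy ID3 FID3 Classification Fuzzy MicroRNA Prediction Rule-set reduction Machine learning Pre-microRNA Computer Sciences |
Online Access: | http://digitalarchive.gsu.edu/cs_diss/48 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss |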
id |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1048 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1048 2013-04-23T03:18:55Z Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms Abu-halaweh, Nael Mohammed 2009-12-02 text application/pdf http://digitalarchive.gsu.edu/cs_diss/48 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss Computer Science Dissertations Digital Archive @ GSU Decision tree ID3 Fuzzy ID3 FID3 Classification Fuzzy MicroRNA Prediction Rule-set reduction Machine learning Pre-microRNA Computer Sciences |
collection |
NDLTD |
format |
Others |
sources |
NDLTD |
topic |
Decision tree ID3 Fuzzy ID3 FID3 Classification Fuzzy MicroRNA Prediction Rule-set reduction Machine learning Pre-microRNA Computer Sciences |
description |
Machine learning approaches have been successfully applied to many classification and prediction problems. One of the most popular machine learning approaches is the decision tree. A main advantage of decision trees is the clarity of the decision model they produce. The ID3 algorithm proposed by Quinlan forms the basis for many decision tree applications. Trees produced by ID3 are sensitive to small perturbations in the training data. To overcome this problem and to handle data uncertainty and spurious precision in data, fuzzy ID3 integrated fuzzy set theory and ideas from fuzzy logic with ID3. Several fuzzy decision tree algorithms and tools exist. However, existing tools are slow, produce a large number of rules, and/or lack support for automatic fuzzification of input data. These limitations make those tools unsuitable for a variety of applications, including those with many features and real-time ones such as intrusion detection. In addition, the large number of rules produced by these tools renders the generated decision model uninterpretable. In this research work, we proposed an improved version of the fuzzy ID3 algorithm. We also introduced a new method for reducing the number of fuzzy rules generated by fuzzy ID3. In addition, we applied fuzzy decision trees to the classification of real and pseudo microRNA precursors. Our experimental results showed that our improved fuzzy ID3 can achieve better classification accuracy and is more efficient than the original fuzzy ID3 algorithm, and that fuzzy decision trees can outperform several existing machine learning algorithms on a wide variety of datasets. Our experiments also showed that our fuzzy rule reduction method significantly reduced the number of rules produced, thereby improving the comprehensibility of the decision model and reducing the execution time of the fuzzy decision tree. This reduction in the number of rules was accompanied by a slight improvement in the classification accuracy of the resulting fuzzy decision tree. Finally, when applied to the microRNA prediction problem, the fuzzy decision tree achieved better results than other machine learning approaches applied to the same problem, including Random Forest, C4.5, SVM, and kNN. |
author |
Abu-halaweh, Nael Mohammed |
title |
Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms |
publisher |
Digital Archive @ GSU |
publishDate |
2009 |
url |
http://digitalarchive.gsu.edu/cs_diss/48 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss |