Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms

Bibliographic Details
Main Author: Abu-halaweh, Nael Mohammed
Format: Others
Published: Digital Archive @ GSU 2009
Subjects:
ID3
Online Access: http://digitalarchive.gsu.edu/cs_diss/48
http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss
id ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1048
record_format oai_dc
spelling ndltd-GEORGIA-oai-digitalarchive.gsu.edu-cs_diss-1048 2013-04-23T03:18:55Z Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms Abu-halaweh, Nael Mohammed 2009-12-02 text application/pdf http://digitalarchive.gsu.edu/cs_diss/48 http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss Computer Science Dissertations Digital Archive @ GSU Decision tree ID3 Fuzzy ID3 FID3 Classification Fuzzy MicroRNA Prediction Rule-set reduction Machine learning Pre-microRNA Computer Sciences
collection NDLTD
format Others
sources NDLTD
topic Decision tree
ID3
Fuzzy ID3
FID3
Classification
Fuzzy
MicroRNA
Prediction
Rule-set reduction
Machine learning
Pre-microRNA
Computer Sciences
description Machine learning approaches have been successfully applied to many classification and prediction problems. Decision trees are among the most popular of these approaches, and a main advantage is the clarity of the decision model they produce. The ID3 algorithm proposed by Quinlan forms the basis for many decision tree applications. However, trees produced by ID3 are sensitive to small perturbations in the training data. To overcome this problem and to handle data uncertainty and spurious precision, fuzzy ID3 integrated fuzzy set theory and ideas from fuzzy logic with ID3. Several fuzzy decision tree algorithms and tools exist, but they are slow, produce a large number of rules, and/or lack support for automatic fuzzification of input data. These limitations make them unsuitable for a variety of applications, including those with many features and real-time ones such as intrusion detection. In addition, the large number of rules these tools produce renders the generated decision model uninterpretable. In this research, we proposed an improved version of the fuzzy ID3 algorithm and introduced a new method for reducing the number of fuzzy rules it generates. We also applied fuzzy decision trees to the classification of real and pseudo microRNA precursors. Our experimental results showed that the improved fuzzy ID3 can achieve better classification accuracy and is more efficient than the original fuzzy ID3 algorithm, and that fuzzy decision trees can outperform several existing machine learning algorithms on a wide variety of datasets. Our experiments also showed that the fuzzy rule reduction method significantly reduced the number of rules produced, which improved the comprehensibility of the decision model and reduced the execution time of the fuzzy decision tree. This reduction in the number of rules was accompanied by a slight improvement in the classification accuracy of the resulting fuzzy decision tree. Finally, when applied to the microRNA prediction problem, the fuzzy decision tree achieved better results than other machine learning approaches applied to the same problem, including Random Forest, C4.5, SVM, and k-NN.
author Abu-halaweh, Nael Mohammed
title Integrating Information Theory Measures and a Novel Rule-Set-Reduction Technique to Improve Fuzzy Decision Tree Induction Algorithms
publisher Digital Archive @ GSU
publishDate 2009
url http://digitalarchive.gsu.edu/cs_diss/48
http://digitalarchive.gsu.edu/cgi/viewcontent.cgi?article=1048&context=cs_diss