Sensitivity analysis of predictive data analytic models to attributes

Bibliographic Details
Other Authors: Chiou, James (author)
Format: Others
Language: English
Published: Florida Atlantic University
Online Access: http://purl.flvc.org/fau/fd/FA00004274
Description
Summary: Classification algorithms represent a rich set of tools that train a classification model from a given training set in order to classify previously unseen test instances. Although existing work has studied classification algorithm performance with respect to feature selection, noise conditions, and sample distributions, it has not addressed an important issue: how classification performance responds to feature deletion and addition. In this thesis, we carry out a sensitivity study of classification algorithms using feature deletion and addition. Three types of classifiers, (1) weak classifiers, (2) generic and strong classifiers, and (3) ensemble classifiers, are validated on three types of data: (1) feature dimension data, (2) gene expression data, and (3) biomedical document data. In the experiments, we continuously add redundant features to the training and test sets to observe how classification performance changes, and we also continuously remove features to measure the performance of the underlying classifiers. Our studies yield a number of important findings, which will help the data mining and machine learning community understand the genuine performance of common classification algorithms on real-world data.
Includes bibliography. Thesis (M.S.)--Florida Atlantic University, 2014. FAU Electronic Theses and Dissertations Collection.
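The feature addition and deletion protocol described in the summary can be illustrated with a minimal sketch. This is not the thesis's code: the synthetic two-class data, the nearest-centroid classifier (a stand-in for a weak learner), the way redundant features are generated (noisy copies of existing columns), and all function names are assumptions made for illustration only.

```python
import random


def make_data(n, n_features, shift, rng):
    """Synthetic two-class data: class 1 is shifted by `shift` on every feature."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(label * shift, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def add_redundant(X, k, noise, rng):
    """Append k redundant features: noisy copies of existing columns (assumed scheme)."""
    d = len(X[0])
    return [row + [row[j % d] + rng.gauss(0.0, noise) for j in range(k)] for row in X]


def drop_features(X, k):
    """Delete the last k features from every instance."""
    keep = len(X[0]) - k
    return [row[:keep] for row in X]


def centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit a nearest-centroid classifier on the training set; return test accuracy."""
    d = len(X_train[0])
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    correct = 0
    for x, t in zip(X_test, y_test):
        # Predict the class whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
        correct += pred == t
    return correct / len(X_test)


def sensitivity_sweep(seed=0):
    """Accuracy after adding or dropping k features, applied to both train and test sets."""
    rng = random.Random(seed)
    X_train, y_train = make_data(200, 6, 1.0, rng)
    X_test, y_test = make_data(200, 6, 1.0, rng)
    results = {}
    for k in (0, 2, 4):
        results[("add", k)] = centroid_accuracy(
            add_redundant(X_train, k, 0.1, rng), y_train,
            add_redundant(X_test, k, 0.1, rng), y_test,
        )
        results[("drop", k)] = centroid_accuracy(
            drop_features(X_train, k), y_train,
            drop_features(X_test, k), y_test,
        )
    return results
```

Calling `sensitivity_sweep()` returns a dictionary keyed by `(operation, k)`, so accuracy under addition and deletion can be compared against the `k = 0` baseline. Repeating the sweep with other classifiers or datasets would follow the same pattern.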