Causality-Based Attribute Weighting via Information Flow and Genetic Algorithm for Naive Bayes Classifier

Bibliographic Details
Main Authors: Ming Li, Kefeng Liu
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8869768/
Description
Summary: The naive Bayes classifier (NBC) is an effective classification technique in data mining and machine learning that rests on the assumption that attributes are conditionally independent given the class. This assumption rarely holds in real-world applications, so numerous studies have sought to relax it through attribute weighting. To the best of our knowledge, almost all of these studies calculate attribute weights from a correlation measure or from classification accuracy. In this paper, we propose a novel causality-based attribute weighting method that yields a weighted NBC called IFG-WNBC, in which causal information flow (IF) theory and a genetic algorithm (GA) are used to search for optimal weights. Introducing IF provides a brand-new weighting criterion based on causality rather than correlation. The GA population initialization is also improved by seeding it with IF-based weights for more efficient optimization. Multiple sets of comparison experiments on UCI data sets demonstrate that IFG-WNBC outperforms the classic NBC and other common weighted NBC algorithms in both classification accuracy and running time.
ISSN: 2169-3536
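
Illustrative note: the summary above does not give the decision rule of the weighted NBC, so the following is only a minimal sketch of the common attribute-weighted form, in which each attribute's log-likelihood is scaled by a weight w_i. The function names (train_counts, predict), the Laplace smoothing, the toy data, and the hand-set weights are all assumptions made for illustration. The sketch does not compute information flow and does not run a genetic algorithm; it only shows how a weight vector, however obtained, changes the prediction.

```python
import math
from collections import Counter, defaultdict

def train_counts(X, y):
    """Collect class counts and per-attribute value counts (categorical data)."""
    class_counts = Counter(y)
    # value_counts[(i, c)][v] = number of class-c rows whose attribute i equals v
    value_counts = defaultdict(Counter)
    domains = [set() for _ in X[0]]
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains

def predict(x, weights, class_counts, value_counts, domains):
    """Return argmax over c of log P(c) + sum_i w_i * log P(x_i | c)."""
    n = sum(class_counts.values())
    k = len(class_counts)
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        # Laplace-smoothed class prior (smoothing is an assumption of this sketch)
        score = math.log((n_c + 1) / (n + k))
        for i, v in enumerate(x):
            # Laplace-smoothed conditional probability of attribute value v
            p = (value_counts[(i, c)][v] + 1) / (n_c + len(domains[i]))
            # The weight scales the attribute's contribution; w_i = 1 is classic NBC
            score += weights[i] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Hypothetical toy data: attribute 0 favours class "A" for value "a",
# attribute 1 favours class "B" for value "q".
X = [("a", "p"), ("a", "p"), ("a", "q"), ("b", "q"), ("b", "q"), ("a", "q")]
y = ["A", "A", "A", "B", "B", "B"]
model = train_counts(X, y)

only_first = predict(("a", "q"), [1.0, 0.0], *model)   # trusts attribute 0 only
only_second = predict(("a", "q"), [0.0, 1.0], *model)  # trusts attribute 1 only
```

With all weights equal to 1 the rule reduces to the classic NBC. A weighting method such as the one described in the summary would supply the weights vector instead of the hand-set values used here.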