Boosting Accuracy of Classical Machine Learning Antispam Classifiers in Real Scenarios by Applying Rough Set Theory

Bibliographic Details
Main Authors: N. Pérez-Díaz, D. Ruano-Ordás, F. Fdez-Riverola, J. R. Méndez
Format: Article
Language: English
Published: Hindawi Limited 2016-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.1155/2016/5945192
id doaj-c705b8c75a7348148e86319e91548ab4
record_format Article
spelling doaj-c705b8c75a7348148e86319e91548ab4; 2021-07-02T06:29:22Z; eng; Hindawi Limited; Scientific Programming; 1058-9244; 1875-919X; 2016-01-01; 2016; 10.1155/2016/5945192; 5945192; Boosting Accuracy of Classical Machine Learning Antispam Classifiers in Real Scenarios by Applying Rough Set Theory; N. Pérez-Díaz, D. Ruano-Ordás, F. Fdez-Riverola, J. R. Méndez (all four authors: Higher Technical School of Computer Engineering, University of Vigo, Polytechnic Building, Campus Universitario As Lagoas s/n, 32004 Ourense, Spain); http://dx.doi.org/10.1155/2016/5945192
collection DOAJ
language English
format Article
sources DOAJ
author N. Pérez-Díaz
D. Ruano-Ordás
F. Fdez-Riverola
J. R. Méndez
title Boosting Accuracy of Classical Machine Learning Antispam Classifiers in Real Scenarios by Applying Rough Set Theory
publisher Hindawi Limited
series Scientific Programming
issn 1058-9244
1875-919X
publishDate 2016-01-01
description Spam deliveries are a major obstacle to benefiting from the wide range of Internet-based communication forms. Despite the existence of several well-known intelligent techniques for fighting spam, only some specific implementations of the Naïve Bayes algorithm are used in real environments, for performance reasons. Because some of these algorithms suffer from a large number of false positive errors, in this work we propose a rough set postprocessing approach able to significantly improve their accuracy. To demonstrate the advantages of the proposed method, we carried out a straightforward study based on a publicly available standard corpus (SpamAssassin), which compares the performance of previously successful well-known antispam classifiers (i.e., Support Vector Machines, AdaBoost, Flexible Bayes, and Naïve Bayes) with and without the application of our technique. The results clearly show the suitability of our rough set postprocessing approach for increasing the accuracy of previously successful antispam classifiers when working in real scenarios.
url http://dx.doi.org/10.1155/2016/5945192