A Novel Hybrid Feature Selection Algorithm for Hierarchical Classification

Feature selection is a widespread preprocessing step in the data mining field. One of its purposes is to reduce the number of original dataset features to improve a predictive model’s performance. Despite the benefits of feature selection for the classification task, to the best of our kn...


Bibliographic Details
Main Authors: Helen C. S. C. Lima, Fernando E. B. Otero, Luiz H. C. Merschmann, Marcone J. F. Souza
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
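Subjects: Feature selection; hierarchical single-label classification; variable neighborhood search; filter; wrapper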
Online Access: https://ieeexplore.ieee.org/document/9536739/
id doaj-89ea2c2b2e894bb0b1b945ee7a76804a
record_format Article
spelling doaj-89ea2c2b2e894bb0b1b945ee7a76804a
2021-09-20T23:00:36Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
9
127278
127292
10.1109/ACCESS.2021.3112396
9536739
A Novel Hybrid Feature Selection Algorithm for Hierarchical Classification
Helen C. S. C. Lima 0 https://orcid.org/0000-0002-2491-4750
Fernando E. B. Otero 1 https://orcid.org/0000-0003-2172-297X
Luiz H. C. Merschmann 2 https://orcid.org/0000-0002-9948-2673
Marcone J. F. Souza 3 https://orcid.org/0000-0002-7141-357X
Department of Computing, Federal University of Ouro Preto, Ouro Preto, Brazil
School of Computing, University of Kent, Canterbury, U.K.
Department of Applied Computing, Federal University of Lavras, Lavras, Brazil
Department of Computing, Federal University of Ouro Preto, Ouro Preto, Brazil
Feature selection is a widespread preprocessing step in the data mining field. One of its purposes is to reduce the number of original dataset features to improve a predictive model’s performance. Despite the benefits of feature selection for the classification task, to the best of our knowledge, few studies in the literature address feature selection for the hierarchical classification context. This paper proposes a novel feature selection method based on the general variable neighborhood search metaheuristic, combining a filter and a wrapper step, wherein a global model hierarchical classifier evaluates feature subsets. We used twelve datasets from the proteins and images domains to perform computational experiments to validate the effect of the proposed algorithm on classification performance when using two global hierarchical classifiers proposed in the literature. Statistical tests showed that using our method for feature selection led to predictive performances that were consistently better than or equivalent to that obtained by using all features with the benefit of reducing the number of features needed, which justifies its efficiency for the hierarchical classification scenario.
https://ieeexplore.ieee.org/document/9536739/
Feature selection
hierarchical single-label classification
variable neighborhood search
filter
wrapper
collection DOAJ
language English
format Article
sources DOAJ
author Helen C. S. C. Lima
Fernando E. B. Otero
Luiz H. C. Merschmann
Marcone J. F. Souza
title A Novel Hybrid Feature Selection Algorithm for Hierarchical Classification
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Feature selection is a widespread preprocessing step in the data mining field. One of its purposes is to reduce the number of original dataset features to improve a predictive model’s performance. Despite the benefits of feature selection for the classification task, to the best of our knowledge, few studies in the literature address feature selection for the hierarchical classification context. This paper proposes a novel feature selection method based on the general variable neighborhood search metaheuristic, combining a filter and a wrapper step, wherein a global model hierarchical classifier evaluates feature subsets. We used twelve datasets from the proteins and images domains to perform computational experiments to validate the effect of the proposed algorithm on classification performance when using two global hierarchical classifiers proposed in the literature. Statistical tests showed that using our method for feature selection led to predictive performances that were consistently better than or equivalent to that obtained by using all features with the benefit of reducing the number of features needed, which justifies its efficiency for the hierarchical classification scenario.
topic Feature selection
hierarchical single-label classification
variable neighborhood search
filter
wrapper
url https://ieeexplore.ieee.org/document/9536739/
_version_ 1717373909202894848
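
The description field above outlines a hybrid scheme: a filter step narrows the feature set, and a wrapper step searches over subsets with a variable neighborhood search while a classifier scores each subset. The Python sketch below illustrates that general idea only and is not the authors' algorithm. It uses a basic variable neighborhood search with a single-flip local search rather than the general variant named in the abstract, and every name in it (filter_rank, shake, local_search, vns_select, the scores list, and the evaluate callback) is a hypothetical choice made for illustration. In a real setting, evaluate would train and validate a hierarchical classifier on the given subset; here it is left as a caller-supplied function.

```python
import random
from typing import Callable, FrozenSet, List, Sequence, Tuple

Subset = FrozenSet[int]
Evaluator = Callable[[Subset], float]  # higher is better (assumed convention)


def filter_rank(scores: Sequence[float], keep: int) -> List[int]:
    """Filter step: indices of the `keep` highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:keep])


def shake(subset: Subset, candidates: List[int], k: int,
          rng: random.Random) -> Subset:
    """Perturbation: flip the membership of k randomly chosen candidates."""
    flipped = rng.sample(candidates, min(k, len(candidates)))
    return frozenset(subset.symmetric_difference(flipped))


def local_search(subset: Subset, candidates: List[int],
                 evaluate: Evaluator) -> Tuple[Subset, float]:
    """First-improvement search over single-feature flips."""
    best, best_score = subset, evaluate(subset)
    improved = True
    while improved:
        improved = False
        for f in candidates:
            neighbor = best ^ {f}
            if not neighbor:  # never evaluate the empty subset
                continue
            score = evaluate(neighbor)
            if score > best_score:
                best, best_score = neighbor, score
                improved = True
                break
    return best, best_score


def vns_select(scores: Sequence[float], evaluate: Evaluator, keep: int,
               k_max: int = 3, iterations: int = 20,
               seed: int = 0) -> Tuple[Subset, float]:
    """Wrapper step: basic VNS over the features kept by the filter."""
    rng = random.Random(seed)
    candidates = filter_rank(scores, keep)
    if not candidates:
        raise ValueError("the filter step kept no features")
    best, best_score = local_search(frozenset(candidates), candidates, evaluate)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            shaken = shake(best, candidates, k, rng)
            if not shaken:
                k += 1
                continue
            cand, cand_score = local_search(shaken, candidates, evaluate)
            if cand_score > best_score:
                best, best_score = cand, cand_score
                k = 1  # improvement: return to the smallest neighborhood
            else:
                k += 1  # no improvement: widen the perturbation
    return best, best_score
```

To try it, pass a list of per-feature filter scores (for example, values from any relevance measure), a function that maps a frozenset of feature indices to a validation score, and the number of features the filter should keep. The search then returns the best subset it found together with its score.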