Feature selection for chemical sensor arrays using mutual information.

We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best featur...


Bibliographic Details
Main Authors: X Rosalind Wang, Joseph T Lizier, Thomas Nowotny, Amalia Z Berna, Mikhail Prokopenko, Stephen C Trowell
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24595058/pdf/?tool=EBI
id doaj-65b88b8dddd34bd7bddccbc9fcf82e7a
record_format Article
spelling doaj-65b88b8dddd34bd7bddccbc9fcf82e7a; 2021-03-04T09:46:43Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2014-01-01; vol. 9, no. 3, e89840; doi:10.1371/journal.pone.0089840; Feature selection for chemical sensor arrays using mutual information; X Rosalind Wang, Joseph T Lizier, Thomas Nowotny, Amalia Z Berna, Mikhail Prokopenko, Stephen C Trowell
collection DOAJ
sources DOAJ
issn 1932-6203
description We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near-optimal features for chemical sensor arrays.