Feature selection for chemical sensor arrays using mutual information.
We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best featur...
Main Authors: | X Rosalind Wang, Joseph T Lizier, Thomas Nowotny, Amalia Z Berna, Mikhail Prokopenko, Stephen C Trowell |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2014-01-01 |
Series: | PLoS ONE |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24595058/pdf/?tool=EBI |
id |
doaj-65b88b8dddd34bd7bddccbc9fcf82e7a |
---|---|
record_format |
Article |
spelling |
doaj-65b88b8dddd34bd7bddccbc9fcf82e7a 2021-03-04T09:46:43Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2014-01-01 9 3 e89840 10.1371/journal.pone.0089840 Feature selection for chemical sensor arrays using mutual information. X Rosalind Wang Joseph T Lizier Thomas Nowotny Amalia Z Berna Mikhail Prokopenko Stephen C Trowell https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24595058/pdf/?tool=EBI |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
X Rosalind Wang Joseph T Lizier Thomas Nowotny Amalia Z Berna Mikhail Prokopenko Stephen C Trowell |
author_sort |
X Rosalind Wang |
title |
Feature selection for chemical sensor arrays using mutual information. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2014-01-01 |
description |
We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. |
url |
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24595058/pdf/?tool=EBI |
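
The filter approach summarized in the description (choose the feature set with maximal mutual information with the chemical identity) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which mutual information estimator or search strategy the paper used, so the plug-in estimator on already-discretized features, the exhaustive search over subsets of a fixed size, the function names `mutual_information` and `select_features`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def select_features(rows, labels, k):
    """Return the k-feature subset whose joint value has maximal MI with labels.

    Assumption: features are already discretized, and k is small enough
    that trying every subset is affordable.
    """
    n_features = len(rows[0])
    best_subset, best_mi = None, -1.0
    for subset in combinations(range(n_features), k):
        # Treat the selected features as one joint discrete variable.
        joint_values = [tuple(row[i] for i in subset) for row in rows]
        mi = mutual_information(joint_values, labels)
        if mi > best_mi:
            best_subset, best_mi = subset, mi
    return best_subset, best_mi


# Hypothetical toy data: 4 "chemicals" (labels 0..3), 3 discretized features.
# Features 0 and 1 together identify the label; feature 2 is independent of it.
labels = [i % 4 for i in range(8)]
rows = [(label // 2, label % 2, i // 4) for i, label in enumerate(labels)]

best_subset, best_mi = select_features(rows, labels, k=2)
print(best_subset, best_mi)
```

On this toy data the pair of features 0 and 1 is selected because together they determine the label, while feature 2 contributes no information. Real sensor features are continuous, so a practical version would need either a discretization step or a continuous estimator, and the number of subsets grows quickly with the number of features.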