rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery

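We describe the R package rmcfs that implements an algorithm for ranking features from high dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. This R package is the new and extended version of the MCFS (Monte Carlo feature selection) algorithm where an early version was published in 2005. The package provides an easy R interface, a set of tools to review results and the new ID (interdependency discovery) component. The algorithm can be used on continuous and/or categorical features (e.g., gene expression and phenotypic data) to produce an objective ranking of features with a statistically well-defined cutoff between informative and non-informative ones. Moreover, the directed ID graph that presents interdependencies between informative features is provided.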


Bibliographic Details
Main Authors: Michał Dramiński, Jacek Koronacki
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2018-07-01
Series: Journal of Statistical Software
Subjects: R
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/2621
id doaj-a8b7e3f9c1cb486bbf56e034eeaf995e
record_format Article
spelling doaj-a8b7e3f9c1cb486bbf56e034eeaf995e
2020-11-24T22:08:22Z
eng
Foundation for Open Access Statistics
Journal of Statistical Software
1548-7660
2018-07-01
851128
10.18637/jss.v085.i12
1230
rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery
Michał Dramiński
Jacek Koronacki
We describe the R package rmcfs that implements an algorithm for ranking features from high dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. This R package is the new and extended version of the MCFS (Monte Carlo feature selection) algorithm where an early version was published in 2005. The package provides an easy R interface, a set of tools to review results and the new ID (interdependency discovery) component. The algorithm can be used on continuous and/or categorical features (e.g., gene expression and phenotypic data) to produce an objective ranking of features with a statistically well-defined cutoff between informative and non-informative ones. Moreover, the directed ID graph that presents interdependencies between informative features is provided.
https://www.jstatsoft.org/index.php/jss/article/view/2621
MCFS-ID
feature selection
high-dimensional problems
Java
R
ID graph
collection DOAJ
language English
format Article
sources DOAJ
author Michał Dramiński
Jacek Koronacki
author_facet Michał Dramiński
Jacek Koronacki
author_sort Michał Dramiński
title rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery
title_sort rmcfs: an r package for monte carlo feature selection and interdependency discovery
publisher Foundation for Open Access Statistics
series Journal of Statistical Software
issn 1548-7660
publishDate 2018-07-01
description We describe the R package rmcfs that implements an algorithm for ranking features from high dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. This R package is the new and extended version of the MCFS (Monte Carlo feature selection) algorithm where an early version was published in 2005. The package provides an easy R interface, a set of tools to review results and the new ID (interdependency discovery) component. The algorithm can be used on continuous and/or categorical features (e.g., gene expression and phenotypic data) to produce an objective ranking of features with a statistically well-defined cutoff between informative and non-informative ones. Moreover, the directed ID graph that presents interdependencies between informative features is provided.
topic MCFS-ID
feature selection
high-dimensional problems
Java
R
ID graph
url https://www.jstatsoft.org/index.php/jss/article/view/2621
work_keys_str_mv AT michałdraminski rmcfsanrpackageformontecarlofeatureselectionandinterdependencydiscovery
AT jacekkoronacki rmcfsanrpackageformontecarlofeatureselectionandinterdependencydiscovery
_version_ 1725816369550721024