Stable Iterative Variable Selection

Motivation: The emergence of datasets with tens of thousands of features, such as high-throughput omics biomedical data, highlights the importance of reducing the feature space into a distilled subset that can truly capture the signal for research and industry by aiding in finding more effective bio...

Bibliographic Details
Main Authors: Elo, L.L. (Author), Klén, R. (Author), Mahmoudian, M. (Author), Venäläinen, M.S. (Author)
Format: Article
Language: English
Published: Oxford University Press 2021
Online Access: View Fulltext in Publisher
LEADER 01704nam a2200169Ia 4500
001 10.1093-bioinformatics-btab501
008 220427s2021 CNT 000 0 und d
020 |a 13674803 (ISSN) 
245 1 0 |a Stable Iterative Variable Selection 
260 0 |b Oxford University Press  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1093/bioinformatics/btab501 
520 3 |a Motivation: The emergence of datasets with tens of thousands of features, such as high-throughput omics biomedical data, highlights the importance of reducing the feature space into a distilled subset that can truly capture the signal for research and industry by aiding in finding more effective biomarkers for the question in hand. A good feature set also facilitates building robust predictive models with improved interpretability and convergence of the applied method due to the smaller feature space. Results: Here, we present a robust feature selection method named Stable Iterative Variable Selection (SIVS) and assess its performance over both omics and clinical data types. As a performance assessment metric, we compared the number and goodness of the features selected using SIVS to those selected by Least Absolute Shrinkage and Selection Operator (LASSO) regression. The results suggested that the feature space selected by SIVS was, on average, 41% smaller, without having a negative effect on the model performance. A similar result was observed in comparisons with Boruta and caret RFE. © 2021 The Author(s). Published by Oxford University Press. 
700 1 |a Elo, L.L.  |e author 
700 1 |a Klén, R.  |e author 
700 1 |a Mahmoudian, M.  |e author 
700 1 |a Venäläinen, M.S.  |e author 
773 |t Bioinformatics