Robust proportional overlapping analysis for feature selection in binary classification within functional genomic experiments

This paper proposes a novel feature selection method for microarray gene expression datasets, the Robust Proportional Overlapping Score (RPOS), which uses a robust measure of dispersion, the Median Absolute Deviation (MAD). This method robustly identifies the most discriminativ...

Bibliographic Details
Main Authors: Muhammad Hamraz, Naz Gul, Mushtaq Raza, Dost Muhammad Khan, Umair Khalil, Seema Zubair, Zardad Khan
Format: Article
Language: English
Published: PeerJ Inc. 2021-06-01
Series: PeerJ Computer Science
Subjects: Overlapping analysis, Feature selection, Binary classification, Functional genomic
Online Access: https://peerj.com/articles/cs-562.pdf
id doaj-8ad7cac54c9c4d67ae2e1b7690d5deb0
record_format Article
spelling doaj-8ad7cac54c9c4d67ae2e1b7690d5deb0
timestamp 2021-06-03T15:05:17Z
volume 7
article_number e562
doi 10.7717/peerj-cs.562
affiliation Muhammad Hamraz: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
Naz Gul: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
Mushtaq Raza: Department of Computer Sciences, Abdul Wali Khan University Mardan, Mardan, Pakistan
Dost Muhammad Khan: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
Umair Khalil: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
Seema Zubair: Department of Mathematics, Statistics and Computer Science, University of Agriculture Peshawar, Peshawar, Pakistan
Zardad Khan: Department of Statistics, Abdul Wali Khan University Mardan, Mardan, Pakistan
collection DOAJ
language English
format Article
sources DOAJ
publisher PeerJ Inc.
series PeerJ Computer Science
issn 2376-5992
publishDate 2021-06-01
description This paper proposes a novel feature selection method for microarray gene expression datasets, the Robust Proportional Overlapping Score (RPOS), which uses a robust measure of dispersion, the Median Absolute Deviation (MAD). The method identifies the most discriminative genes by considering the overlapping scores of the gene expression values in binary class problems. Genes with a high degree of overlap between classes are discarded, and those that discriminate between the classes are selected. The proposed method is compared with five state-of-the-art gene selection methods on eleven gene expression datasets in terms of classification error, Brier score, and sensitivity. Observations are classified using the sets of genes selected by the proposed method with three classifiers: random forest, k-nearest neighbors (k-NN), and support vector machine (SVM). Box plots and stability scores of the results are also reported. The results show that the proposed method outperforms the other methods in most cases.
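The idea in the description (discard genes whose expression values overlap heavily between the two classes, using MAD as the dispersion measure) can be illustrated with a minimal Python sketch. This is not the authors' RPOS formula, which the abstract does not state. The score definition, the interval half-width `k`, and the names `mad`, `overlap_score`, and `select_genes` are all assumptions made for illustration only.

```python
import numpy as np


def mad(x):
    """Unscaled median absolute deviation of a 1-D array."""
    med = np.median(x)
    return np.median(np.abs(x - med))


def overlap_score(x, y, k=1.0):
    """Hypothetical overlap score for one gene (lower = better separation).

    x: expression values of one gene; y: binary labels (0/1).
    Each class gets a core interval median +/- k * MAD (an assumed choice).
    The score is the fraction of all samples that fall inside the
    intersection of the two class intervals.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    a, b = x[y == 0], x[y == 1]
    lo = max(np.median(a) - k * mad(a), np.median(b) - k * mad(b))
    hi = min(np.median(a) + k * mad(a), np.median(b) + k * mad(b))
    if lo > hi:  # the class intervals do not intersect
        return 0.0
    return float(np.mean((x >= lo) & (x <= hi)))


def select_genes(X, y, n_genes):
    """Rank genes (columns of X) by ascending overlap and keep the first n_genes."""
    scores = np.array([overlap_score(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(scores, kind="stable")
    return order[:n_genes], scores


# Toy data: 6 samples (rows), 2 genes (columns); gene 0 separates the classes.
X = np.array([[1, 1], [2, 2], [3, 3], [10, 2], [11, 3], [12, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
selected, scores = select_genes(X, y, n_genes=1)
print(selected, scores)
```

In this sketch the median and MAD replace the mean and standard deviation so that a few outlying expression values do not widen a class interval. The selected columns could then be passed to any of the classifiers named in the description.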
topic Overlapping analysis
Feature selection
Binary classification
Functional genomic
url https://peerj.com/articles/cs-562.pdf