A hybrid feature subset selection algorithm for analysis of high correlation proteomic data

Pathological changes within an organ can be reflected as proteomic patterns in biological fluids such as plasma, serum, and urine. The surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF MS) has been used to generate proteomic profiles from biological fluids....


Bibliographic Details
Main Authors: Hussain Montazery Kordy, Mohammad Hossein Miran Baygi, Mohammad Hassan Moradi
Format: Article
Language: English
Published: Wolters Kluwer Medknow Publications 2012-01-01
Series: Journal of Medical Signals and Sensors
Subjects: Biomarker; classification; correlation-based weight function; feature subset selection; peak scoring; proteomics
Online Access: http://www.jmss.mui.ac.ir/article.asp?issn=2228-7477;year=2012;volume=2;issue=3;spage=161;epage=168;aulast=Kordy
id doaj-b6048f37f850493381d541f7cb057dfb
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author Hussain Montazery Kordy
Mohammad Hossein Miran Baygi
Mohammad Hassan Moradi
title A hybrid feature subset selection algorithm for analysis of high correlation proteomic data
publisher Wolters Kluwer Medknow Publications
series Journal of Medical Signals and Sensors
issn 2228-7477
publishDate 2012-01-01
description Pathological changes within an organ can be reflected as proteomic patterns in biological fluids such as plasma, serum, and urine. The surface-enhanced laser desorption and ionization time-of-flight mass spectrometry (SELDI-TOF MS) has been used to generate proteomic profiles from biological fluids. Mass spectrometry yields redundant, noisy data in which most data points are irrelevant features for differentiating between cancer and normal cases. In this paper, we propose a hybrid feature subset selection algorithm based on maximum discrimination and minimum correlation, coupled with peak scoring criteria. The algorithm was applied to two independent SELDI-TOF MS datasets of ovarian cancer obtained from the NCI-FDA clinical proteomics databank and was used to extract a set of proteins as potential biomarkers in each dataset. We applied linear discriminant analysis to identify the important biomarkers. The selected biomarkers distinguished ovarian cancer patients from the noncancer control group with an accuracy of 100%, a sensitivity of 100%, and a specificity of 100% in both datasets. The hybrid algorithm has the advantage of increasing the reproducibility of the selected biomarkers, and it is able to find a small set of proteins with high discrimination power.
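The idea of selecting features for maximum discrimination and minimum correlation can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not give their weight function or peak scoring criteria, so the Fisher-style score, the Pearson correlation threshold `max_corr`, and the names `fisher_score` and `select_features` are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def fisher_score(X, y):
    """Assumed discrimination measure: squared difference of class means
    divided by the sum of class variances, computed per feature."""
    a, b = X[y == 0], X[y == 1]
    return (a.mean(axis=0) - b.mean(axis=0)) ** 2 / (a.var(axis=0) + b.var(axis=0) + 1e-12)

def select_features(X, y, k, max_corr=0.8):
    """Greedy selection: visit features from highest to lowest score and skip
    any feature whose absolute Pearson correlation with an already chosen
    feature exceeds max_corr (an assumed redundancy threshold)."""
    scores = fisher_score(X, y)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    chosen = []
    for j in np.argsort(scores)[::-1]:
        if all(corr[j, c] <= max_corr for c in chosen):
            chosen.append(int(j))
        if len(chosen) == k:
            break
    return chosen

# Synthetic data standing in for spectra: 200 samples, 6 intensity features.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)                    # 0 = control, 1 = cancer (labels are illustrative)
signal = y + 0.1 * rng.standard_normal(n)        # feature 0: strongly discriminative
copy = signal + 0.1 * rng.standard_normal(n)     # feature 1: redundant, highly correlated with feature 0
second = y + 1.0 * rng.standard_normal(n)        # feature 2: weaker, far less correlated with feature 0
noise = rng.standard_normal((n, 3))              # features 3-5: irrelevant
X = np.column_stack([signal, copy, second, noise])

selected = select_features(X, y, k=2)
print(selected)
```

In this toy setup the redundant column 1 is skipped even though its score is high, which is the behaviour a minimum-correlation criterion is meant to produce. The selected columns could then be passed to a classifier such as linear discriminant analysis.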
topic Biomarker
classification
correlation-based weight function
feature subset selection
peak scoring
proteomics
url http://www.jmss.mui.ac.ir/article.asp?issn=2228-7477;year=2012;volume=2;issue=3;spage=161;epage=168;aulast=Kordy