Regularized binormal ROC method in disease classification using microarray data

Abstract. Background: An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simult...

Bibliographic Details
Main Authors: Huang Jian, Song Xiao, Ma Shuangge
Format: Article
Language: English
Published: BMC 2006-05-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/7/253
id doaj-773deeb240c14b03b0e1033b16279d1d
record_format Article
spelling doaj-773deeb240c14b03b0e1033b16279d1d 2020-11-25T00:13:43Z eng BMC BMC Bioinformatics 1471-2105 2006-05-01 7 1 253 10.1186/1471-2105-7-253 Regularized binormal ROC method in disease classification using microarray data Huang Jian Song Xiao Ma Shuangge http://www.biomedcentral.com/1471-2105/7/253
collection DOAJ
language English
format Article
sources DOAJ
author Huang Jian
Song Xiao
Ma Shuangge
title Regularized binormal ROC method in disease classification using microarray data
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2006-05-01
description Abstract
Background: An important application of microarrays is to discover genomic biomarkers, among tens of thousands of genes assayed, for disease diagnosis and prognosis. Thus it is of interest to develop efficient statistical methods that can simultaneously identify important biomarkers from such high-throughput genomic data and construct appropriate classification rules. It is also of interest to develop methods for evaluation of classification performance and ranking of identified biomarkers.
Results: The ROC (receiver operating characteristic) technique has been widely used in disease classification with low dimensional biomarkers. Compared with the empirical ROC approach, the binormal ROC is computationally more affordable and robust in small sample size cases. We propose using the binormal AUC (area under the ROC curve) as the objective function for two-sample classification, and the scaled threshold gradient directed regularization method for regularized estimation and biomarker selection. Tuning parameter selection is based on V-fold cross validation. We develop Monte Carlo based methods for evaluating the stability of individual biomarkers and overall prediction performance. Extensive simulation studies show that the proposed approach can generate parsimonious models with excellent classification and prediction performance, under most simulated scenarios including model mis-specification. Application of the method to two cancer studies shows that the identified genes are reasonably stable with satisfactory prediction performance and biologically sound implications. The overall classification performance is satisfactory, with small classification errors and large AUCs.
Conclusion: In comparison to existing methods, the proposed approach is computationally more affordable without losing the optimality possessed by the standard ROC method.
url http://www.biomedcentral.com/1471-2105/7/253