Quantifying stability in gene list ranking across microarray derived clinical biomarkers

Abstract. Background: Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versat...

Bibliographic Details
Main Authors: Arden Nilou S, Schneckener Sebastian, Schuppert Andreas
Format: Article
Language: English
Published: BMC 2011-10-01
Series: BMC Medical Genomics
Online Access: http://www.biomedcentral.com/1755-8794/4/73
id doaj-74eaed1a37a84b91ae97601230767c2f
record_format Article
spelling doaj-74eaed1a37a84b91ae97601230767c2f 2021-04-02T04:26:10Z eng BMC BMC Medical Genomics 1755-8794 2011-10-01 4 1 73 10.1186/1755-8794-4-73 Quantifying stability in gene list ranking across microarray derived clinical biomarkers Arden Nilou S Schneckener Sebastian Schuppert Andreas http://www.biomedcentral.com/1755-8794/4/73
collection DOAJ
language English
format Article
sources DOAJ
author Arden Nilou S
Schneckener Sebastian
Schuppert Andreas
spellingShingle Arden Nilou S
Schneckener Sebastian
Schuppert Andreas
Quantifying stability in gene list ranking across microarray derived clinical biomarkers
BMC Medical Genomics
author_facet Arden Nilou S
Schneckener Sebastian
Schuppert Andreas
author_sort Arden Nilou S
title Quantifying stability in gene list ranking across microarray derived clinical biomarkers
title_short Quantifying stability in gene list ranking across microarray derived clinical biomarkers
title_full Quantifying stability in gene list ranking across microarray derived clinical biomarkers
title_fullStr Quantifying stability in gene list ranking across microarray derived clinical biomarkers
title_full_unstemmed Quantifying stability in gene list ranking across microarray derived clinical biomarkers
title_sort quantifying stability in gene list ranking across microarray derived clinical biomarkers
publisher BMC
series BMC Medical Genomics
issn 1755-8794
publishDate 2011-10-01
description <p>Abstract</p> <p>Background</p> <p>Identifying stable gene lists for diagnosis, prognosis prediction, and treatment guidance of tumors remains a major challenge in cancer research. Microarrays measuring differential gene expression are widely used and should be versatile predictors of disease and other phenotypic data. However, gene expression profile studies and predictive biomarkers often have low power, requiring many samples for sound statistics, or vary between studies. Given the inconsistency of results across similar studies, methods that identify robust biomarkers from microarray data are needed to relay true biological information. Here we present a method to demonstrate that gene list stability and predictive power depend not only on the size of studies, but also on the clinical phenotype.</p> <p>Results</p> <p>Our method projects genomic tumor expression data to a lower dimensional space representing the main variation in the data. Some information regarding the phenotype resides in this low dimensional space, while some information resides in the residuum. We then introduce an information ratio (IR) as a metric defined by the partition between projected and residual space. Upon grouping phenotypes such as tumor tissue, histological grades, relapse, or aging, we show that higher IR values correlated with phenotypes that yield less robust biomarkers, whereas lower IR values showed higher transferability across studies. Our results indicate that the IR is correlated with predictive accuracy. When tested across different published datasets, the IR can identify information-rich data characterizing clinical phenotypes and stable biomarkers.</p> <p>Conclusions</p> <p>The IR presents a quantitative metric to estimate the information content of gene expression data with respect to particular phenotypes.</p>
url http://www.biomedcentral.com/1755-8794/4/73
work_keys_str_mv AT ardennilous quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers
AT schneckenersebastian quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers
AT schuppertandreas quantifyingstabilityingenelistrankingacrossmicroarrayderivedclinicalbiomarkers
_version_ 1724173264357425152
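
Illustrative note: the abstract describes projecting expression data onto a low dimensional space that captures the main variation, then comparing the phenotype information in that space with the information left in the residuum. The abstract does not give the formula for the IR, so the sketch below is only one possible reading and not the authors' definition. It assumes a PCA projection, a two-class phenotype, and uses the squared difference between class means as a stand-in for "information". The function name `residual_information_fraction` and the parameter `k` are hypothetical.

```python
import numpy as np


def residual_information_fraction(X, y, k=2):
    """Share of the class-mean separation that lies outside the top-k PCA subspace.

    ASSUMPTION: this is a stand-in for the information ratio (IR) named in the
    abstract, not the published formula. X is samples x genes, y holds exactly
    two class labels, and k is the number of principal components kept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch assumes exactly two phenotype classes")

    # Center each gene, then take the principal directions from the SVD.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T  # genes x k basis of the "main variation" subspace

    # Split the data into its projection and the orthogonal residuum.
    projected = Xc @ V @ V.T
    residual = Xc - projected

    # Squared distance between class means in each part (assumed proxy).
    first, second = (y == classes[0]), (y == classes[1])
    d_proj = projected[first].mean(axis=0) - projected[second].mean(axis=0)
    d_res = residual[first].mean(axis=0) - residual[second].mean(axis=0)
    proj_info = float(np.sum(d_proj ** 2))
    res_info = float(np.sum(d_res ** 2))

    total = proj_info + res_info
    if total == 0.0:
        return float("nan")  # the classes have identical means
    return res_info / total
```

Under these assumptions, a value near 0 means the phenotype is separated mainly along the dominant variation in the data, and a value near 1 means the separation sits almost entirely in the residuum. Writing the quantity as a bounded fraction rather than a raw quotient avoids division by zero when the projected part carries no class separation.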