A Practical Multifaceted Approach to Selecting Differentially Expressed Genes

Consider a gene expression array study comparing two groups of subjects where the goal is to explore a large number of genes in order to select for further investigation a subset that appear to be differently expressed. There has been much statistical research into the development of formal methods...


Bibliographic Details
Main Authors: Yingye Zheng, Margaret Pepe
Format: Article
Language: English
Published: SAGE Publishing 2007-01-01
Series: Cancer Informatics
Online Access: https://doi.org/10.1177/117693510700300032
id doaj-dcdaa242ba28425ca01dccc75f2c62bc
record_format Article
spelling doaj-dcdaa242ba28425ca01dccc75f2c62bc; 2020-11-25T03:20:53Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2007-01-01; 3; 10.1177/117693510700300032; A Practical Multifaceted Approach to Selecting Differentially Expressed Genes; Yingye Zheng (0); Margaret Pepe (1); Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N., M2-B232 Seattle, WA 98109; Department of Biostatistics, University of Washington, Box 357232 Seattle, WA 98195
collection DOAJ
language English
format Article
sources DOAJ
author Yingye Zheng
Margaret Pepe
title A Practical Multifaceted Approach to Selecting Differentially Expressed Genes
publisher SAGE Publishing
series Cancer Informatics
issn 1176-9351
publishDate 2007-01-01
description Consider a gene expression array study comparing two groups of subjects where the goal is to explore a large number of genes in order to select for further investigation a subset that appear to be differently expressed. There has been much statistical research into the development of formal methods for designating genes as differentially expressed. These procedures control error rates such as the false detection rate or family-wise error rate. We contend, however, that other statistical considerations are also relevant to the task of gene selection. These include the extent of differential expression and the strength of evidence for differential expression at a gene. Using real and simulated data, we first demonstrate that a proper exploratory analysis should evaluate these aspects as well as decision rules that control error rates. We propose a new measure called the mp-value that quantifies strength of evidence for differential expression. The mp-values are calculated with a resampling-based algorithm taking into account the multiplicity and dependence encountered in microarray data. In contrast to traditional p-values, our mp-values do not depend on specification of a decision rule for their definition. They are simply descriptive in nature. We contrast the mp-values with multiple testing p-values in the context of data from a breast cancer prognosis study and from a simulation model.
url https://doi.org/10.1177/117693510700300032