A Practical Multifaceted Approach to Selecting Differentially Expressed Genes
Consider a gene expression array study comparing two groups of subjects, where the goal is to explore a large number of genes in order to select for further investigation a subset that appear to be differentially expressed. There has been much statistical research into the development of formal methods...
Main Authors: | Yingye Zheng, Margaret Pepe |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2007-01-01 |
Series: | Cancer Informatics |
Online Access: | https://doi.org/10.1177/117693510700300032 |
id |
doaj-dcdaa242ba28425ca01dccc75f2c62bc |
---|---|
record_format |
Article |
spelling |
doaj-dcdaa242ba28425ca01dccc75f2c62bc; 2020-11-25T03:20:53Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2007-01-01; 3; 10.1177/117693510700300032; A Practical Multifaceted Approach to Selecting Differentially Expressed Genes; Yingye Zheng [0], Margaret Pepe [1]; Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N., M2-B232, Seattle, WA 98109; Department of Biostatistics, University of Washington, Box 357232, Seattle, WA 98195; https://doi.org/10.1177/117693510700300032 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Yingye Zheng, Margaret Pepe |
author_sort |
Yingye Zheng |
title |
A Practical Multifaceted Approach to Selecting Differentially Expressed Genes |
title_sort |
practical multifaceted approach to selecting differentially expressed genes |
publisher |
SAGE Publishing |
series |
Cancer Informatics |
issn |
1176-9351 |
publishDate |
2007-01-01 |
description |
Consider a gene expression array study comparing two groups of subjects, where the goal is to explore a large number of genes in order to select for further investigation a subset that appear to be differentially expressed. There has been much statistical research into the development of formal methods for designating genes as differentially expressed. These procedures control error rates such as the false detection rate or family-wise error rate. We contend, however, that other statistical considerations are also relevant to the task of gene selection. These include the extent of differential expression and the strength of evidence for differential expression at a gene. Using real and simulated data, we first demonstrate that a proper exploratory analysis should evaluate these aspects as well as decision rules that control error rates. We propose a new measure called the mp-value that quantifies strength of evidence for differential expression. The mp-values are calculated with a resampling-based algorithm taking into account the multiplicity and dependence encountered in microarray data. In contrast to traditional p-values, our mp-values do not depend on specification of a decision rule for their definition. They are simply descriptive in nature. We contrast the mp-values with multiple testing p-values in the context of data from a breast cancer prognosis study and from a simulation model. |
url |
https://doi.org/10.1177/117693510700300032 |
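
The description above contrasts the proposed mp-values with multiple-testing p-values from procedures that control an error rate. As a rough illustration of what such procedures compute, the sketch below adjusts a list of per-gene p-values with two standard methods: Benjamini-Hochberg (false discovery rate) and Holm (family-wise error rate). Both methods are chosen here for illustration and are not named in this record. This is not the authors' mp-value algorithm, which the record does not specify. The function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (false discovery rate).

    Illustrative helper, not taken from the article.
    """
    m = len(pvalues)
    # Gene indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def holm(pvalues):
    """Return Holm step-down adjusted p-values (family-wise error rate).

    Illustrative helper, not taken from the article.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    # Walk from the smallest p-value up; the multiplier shrinks each step.
    for step, i in enumerate(order):
        running_max = max(running_max, min(1.0, pvalues[i] * (m - step)))
        adjusted[i] = running_max
    return adjusted


# Hypothetical per-gene p-values, for demonstration only.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
bh_adjusted = benjamini_hochberg(example_pvalues)
holm_adjusted = holm(example_pvalues)
```

Both functions return adjusted values in the original gene order, so a gene would be selected at level alpha when its adjusted p-value is at most alpha. These adjusted values are tied to a decision rule, which is the property the description says the mp-value avoids.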