A comparative review of estimates of the proportion unchanged genes and the false discovery rate

Abstract. Background: In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these ...

Bibliographic Details
Main Author:	Broberg Per
Format:	Article
Language:	English
Published:	BMC 2005-08-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/6/199

id	doaj-18b08977546c47a29ac25343ad46ebb4
record_format	Article
spelling	doaj-18b08977546c47a29ac25343ad46ebb4 2020-11-25T02:47:36Z eng BMC BMC Bioinformatics 1471-2105 2005-08-01 6 1 199 10.1186/1471-2105-6-199 A comparative review of estimates of the proportion unchanged genes and the false discovery rate Broberg Per http://www.biomedcentral.com/1471-2105/6/199
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Broberg Per
title	A comparative review of estimates of the proportion unchanged genes and the false discovery rate
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2005-08-01
description	Abstract. Background: In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by pure chance. The distribution of these p-values is a mixture of two components corresponding to the changed genes and the unchanged ones. The focus of this article is how to estimate the proportion unchanged and the false discovery rate (FDR) and how to make inferences based on these concepts. Six published methods for estimating the proportion unchanged genes are reviewed, two alternatives are presented, and all are tested on both simulated and real data. All estimates but one make do without any parametric assumptions concerning the distributions of the p-values. Furthermore, the estimation and use of the FDR and the closely related q-value is illustrated with examples. Five published estimates of the FDR and one new are presented and tested. Implementations in R code are available. Results: A simulation model based on the distribution of real microarray data plus two real data sets were used to assess the methods. The proposed alternative methods for estimating the proportion unchanged fared very well, and gave evidence of low bias and very low variance. Different methods perform well depending upon whether there are few or many regulated genes. Furthermore, the methods for estimating FDR showed a varying performance, and were sometimes misleading. The new method had a very low error. Conclusion: The concept of the q-value or false discovery rate is useful in practical research, despite some theoretical and practical shortcomings. However, it seems possible to challenge the performance of the published methods, and there is likely scope for further developing the estimates of the FDR. The new methods provide the scientist with more options to choose a suitable method for any particular experiment. The article advocates the use of the conjoint information regarding false positive and negative rates as well as the proportion unchanged when identifying changed genes.
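
The mixture idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's R code and not one of the article's new estimators. It assumes the widely used threshold estimate of the proportion unchanged (p-values of unchanged genes are uniform, so the share above a cutoff lam reflects that proportion) and step-up q-values scaled by that proportion. The function names estimate_pi0 and q_values, the cutoff lam = 0.5, and the simulated data are all assumptions made for illustration.

```python
import random


def estimate_pi0(pvalues, lam=0.5):
    """Threshold-based estimate of the proportion of unchanged genes.

    Unchanged genes give p-values that are uniform on [0, 1], so about
    pi0 * m * (1 - lam) of all m p-values should exceed lam.
    lam is a tuning parameter; 0.5 is only an illustrative choice.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))


def q_values(pvalues, pi0=1.0):
    """Step-up q-values: q_(i) = min over j >= i of pi0 * m * p_(j) / j.

    With pi0 = 1 this reduces to Benjamini-Hochberg adjusted p-values.
    Results are returned in the same order as the input.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[idx] / rank)
        q[idx] = running_min
    return q


if __name__ == "__main__":
    # Simulated mixture (an assumption, not data from the article):
    # 800 unchanged genes with uniform p-values and 200 changed genes
    # with p-values concentrated near zero.
    rng = random.Random(0)
    pvals = [rng.random() for _ in range(800)]
    pvals += [rng.betavariate(0.2, 1.0) for _ in range(200)]
    pi0 = estimate_pi0(pvals)
    q = q_values(pvals, pi0)
    print("estimated proportion unchanged:", round(pi0, 3))
    print("genes with q <= 0.05:", sum(1 for v in q if v <= 0.05))
```

Calling q_values with the estimated proportion instead of 1.0 makes the procedure less conservative when many genes are changed, which is one reason the estimate of the proportion unchanged matters for the FDR.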
url	http://www.biomedcentral.com/1471-2105/6/199

Similar Items