A full Bayesian hierarchical mixture model for the variance of gene differential expression

Abstract Background In many laboratory-based high throughput microarray experiments, there are very few replicates of gene expression levels. Thus, estimates of gene variances are inaccurate. Visual inspection of graphical summaries of these data usuall...

Bibliographic Details
Main Authors:	Walls Rebecca E, Manda Samuel OM, Gilthorpe Mark S
Format:	Article
Language:	English
Published:	BMC 2007-04-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/8/124

id	doaj-43cc076c6c204a52a74a5e17ab3efb08
record_format	Article
spelling	doaj-43cc076c6c204a52a74a5e17ab3efb08 2020-11-24T20:55:00Z eng BMC BMC Bioinformatics 1471-2105 2007-04-01 8 1 124 10.1186/1471-2105-8-124 http://www.biomedcentral.com/1471-2105/8/124
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Walls Rebecca E, Manda Samuel OM, Gilthorpe Mark S
title	A full Bayesian hierarchical mixture model for the variance of gene differential expression
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2007-04-01
description	Background: In many laboratory-based high-throughput microarray experiments, there are very few replicates of gene expression levels. Thus, estimates of gene variances are inaccurate. Visual inspection of graphical summaries of these data usually reveals that heteroscedasticity is present, and the standard approach to address this is to take a log2 transformation. In such circumstances, it is then common to assume that gene variability is constant when an analysis of these data is undertaken. However, this is perhaps too stringent an assumption. More careful inspection reveals that the simple log2 transformation does not remove the problem of heteroscedasticity. An alternative strategy is to assume independent gene-specific variances, although this is again problematic, as variance estimates based on few replications are highly unstable. More meaningful and reliable comparisons of gene expression might be achieved, for different conditions or different tissue samples, where the test statistics are based on accurate estimates of gene variability, a crucial step in the identification of differentially expressed genes. Results: We propose a Bayesian mixture model, which classifies genes according to similarity in their variance. The result is that genes in the same latent class share a similar variance, estimated from a larger number of replicates than purely those per gene, i.e. the total of all replicates of all genes in the same latent class. An example dataset, consisting of 9216 genes with four replicates per condition, resulted in four latent classes based on the similarity of their variances. Conclusion: The mixture variance model provides a realistic and flexible estimate for the variance of gene expression data under limited replicates. We believe that in using the latent class variances, estimated from a larger number of genes in each derived latent group, the p-values obtained are more robust than those obtained using either a constant gene variance or a gene-specific variance estimate.
url	http://www.biomedcentral.com/1471-2105/8/124
