Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

Abstract. Background: Clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correl...

Bibliographic Details
Main Authors:	Loraine Ann, Hung Yeung, Salmi Mari L, Chang Chunqi, Yao Jianchao, Roux Stanley J
Format:	Article
Language:	English
Published:	BMC 2008-06-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/9/288

id	doaj-e1f0517b935d41ca86d930a103886b50
record_format	Article
spelling	doaj-e1f0517b935d41ca86d930a103886b50; 2020-11-25T00:15:11Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2008-06-01; 9; 1; 288; 10.1186/1471-2105-9-288; Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient; Loraine Ann, Hung Yeung, Salmi Mari L, Chang Chunqi, Yao Jianchao, Roux Stanley J; http://www.biomedcentral.com/1471-2105/9/288
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Loraine Ann, Hung Yeung, Salmi Mari L, Chang Chunqi, Yao Jianchao, Roux Stanley J
author_sort	Loraine Ann
title	Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2008-06-01
description	Abstract. Background: Clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two correlations most widely used as similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing the replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. Results: In this study, we describe a novel correlation coefficient, the shrinkage correlation coefficient (SCC), that fully exploits the similarity between replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is shown by comparing it with the two most widely used correlation coefficients (the Pearson correlation coefficient and the SD-weighted correlation coefficient) using statistical measures on both synthetic expression data and real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering, were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. Conclusion: This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.
url	http://www.biomedcentral.com/1471-2105/9/288
