Meta-analysis of gene expression microarrays with missing replicates

Abstract. Background: Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample si...

Bibliographic Details
Main Authors: Leckie Christopher, Abraham Gad, Shi Fan, Haviv Izhak, Kowalczyk Adam
Format: Article
Language: English
Published: BMC 2011-03-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/12/84
id doaj-11c3b20f8ec64284b5e6f69625300320
record_format Article
spelling doaj-11c3b20f8ec64284b5e6f69625300320 2020-11-24T21:42:57Z eng BMC BMC Bioinformatics 1471-2105 2011-03-01 12 1 84 10.1186/1471-2105-12-84 Meta-analysis of gene expression microarrays with missing replicates Leckie Christopher; Abraham Gad; Shi Fan; Haviv Izhak; Kowalczyk Adam http://www.biomedcentral.com/1471-2105/12/84
collection DOAJ
language English
format Article
sources DOAJ
author Leckie Christopher
Abraham Gad
Shi Fan
Haviv Izhak
Kowalczyk Adam
author_sort Leckie Christopher
title Meta-analysis of gene expression microarrays with missing replicates
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2011-03-01
description Abstract
Background: Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample size. However, some genes are not measured in all experiments, so traditional meta-analysis either excludes them or cannot appropriately estimate their statistical significance. Nonetheless, these genes, which we refer to as incomplete genes, may also be informative and useful.
Results: We propose a meta-analysis framework, called "Incomplete Gene Meta-analysis", which can include incomplete genes by imputing the significance of missing replicates and computing a meta-score for every gene across all datasets. In two groups of experiments, we demonstrate that incomplete genes are worth including and that our method appropriately estimates their significance. We first apply the Incomplete Gene Meta-analysis and several comparable methods to five breast cancer datasets with an identical set of probes. We simulate incomplete genes by randomly removing a subset of probes from each dataset and demonstrate that our method consistently outperforms two other methods in terms of false discovery rate. We also apply the methods to three gastric cancer datasets for the purpose of discriminating diffuse and intestinal subtypes.
Conclusions: Meta-analysis is an effective approach that identifies more robust sets of differentially expressed genes from multiple studies. Incomplete genes, which arise mainly from the use of different platforms, may also have statistical and biological importance, but previous studies either ignore them or do not handle them appropriately. Our Incomplete Gene Meta-analysis is able to incorporate the incomplete genes by estimating their significance. The results on both breast and gastric cancer datasets suggest that the highly ranked genes and associated GO terms produced by our method are more significant and biologically meaningful according to the previous literature.
url http://www.biomedcentral.com/1471-2105/12/84