Meta-analysis of gene expression microarrays with missing replicates
Abstract. Background: Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample si...
Main Authors: | Leckie Christopher, Abraham Gad, Shi Fan, Haviv Izhak, Kowalczyk Adam |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2011-03-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/12/84 |
id |
doaj-11c3b20f8ec64284b5e6f69625300320 |
---|---|
record_format |
Article |
spelling |
doaj-11c3b20f8ec64284b5e6f69625300320 2020-11-24T21:42:57Z eng BMC BMC Bioinformatics 1471-2105 2011-03-01 12 1 84 10.1186/1471-2105-12-84 Meta-analysis of gene expression microarrays with missing replicates Leckie Christopher Abraham Gad Shi Fan Haviv Izhak Kowalczyk Adam http://www.biomedcentral.com/1471-2105/12/84 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Leckie Christopher Abraham Gad Shi Fan Haviv Izhak Kowalczyk Adam |
title |
Meta-analysis of gene expression microarrays with missing replicates |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2011-03-01 |
description |
Abstract. Background: Many different microarray experiments are publicly available today. It is natural to ask whether different experiments for the same phenotypic conditions can be combined using meta-analysis, in order to increase the overall sample size. However, some genes are not measured in all experiments, so they either cannot be included in traditional meta-analysis or their statistical significance cannot be appropriately estimated. Nonetheless, these genes, which we refer to as incomplete genes, may also be informative and useful. Results: We propose a meta-analysis framework, called "Incomplete Gene Meta-analysis", which can include incomplete genes by imputing the significance of missing replicates and computing a meta-score for every gene across all datasets. We demonstrate in two groups of experiments that the incomplete genes are worth including and that our method is able to appropriately estimate their significance. We first apply the Incomplete Gene Meta-analysis and several comparable methods to five breast cancer datasets with an identical set of probes. We simulate incomplete genes by randomly removing a subset of probes from each dataset and demonstrate that our method consistently outperforms two other methods in terms of false discovery rate. We also apply the methods to three gastric cancer datasets for the purpose of discriminating diffuse and intestinal subtypes. Conclusions: Meta-analysis is an effective approach that identifies more robust sets of differentially expressed genes from multiple studies. Incomplete genes, which mainly arise from the use of different platforms, may also have statistical and biological importance, but previous studies either ignore them or do not handle them appropriately. Our Incomplete Gene Meta-analysis is able to incorporate the incomplete genes by estimating their significance. The results on both breast and gastric cancer datasets suggest that the highly ranked genes and associated GO terms produced by our method are more significant and biologically meaningful according to the previous literature. |
url |
http://www.biomedcentral.com/1471-2105/12/84 |