Integrating multiple genome annotation databases improves the interpretation of microarray gene expression data
<p>Abstract</p> <p>Background</p> <p>The Affymetrix GeneChip is a widely used gene expression profiling platform. Since the chips were originally designed, the genome databases and gene definitions have been considerably updated. Thus, more accurate interpretation of mi...
Main Authors: | Kennedy Breandan, Glaviano Antonino, Jeffery Ian B, McLoughlin Sarah, Yin Jun, Higgins Desmond G |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2010-01-01 |
Series: | BMC Genomics |
Online Access: | http://www.biomedcentral.com/1471-2164/11/50 |
id |
doaj-a9e64c08e4c6487fac8a11f7466f20b3 |
---|---|
record_format |
Article |
spelling |
doaj-a9e64c08e4c6487fac8a11f7466f20b3 2020-11-24T23:58:46Z eng BMC BMC Genomics 1471-2164 2010-01-01 11 1 50 10.1186/1471-2164-11-50 Integrating multiple genome annotation databases improves the interpretation of microarray gene expression data Kennedy Breandan Glaviano Antonino Jeffery Ian B McLoughlin Sarah Yin Jun Higgins Desmond G <p>Abstract</p> <p>Background</p> <p>The Affymetrix GeneChip is a widely used gene expression profiling platform. Since the chips were originally designed, the genome databases and gene definitions have been considerably updated. Thus, more accurate interpretation of microarray data requires parallel updating of the specificity of GeneChip probes. We propose a new probe remapping protocol, using the zebrafish GeneChips as an example, by removing nonspecific probes, and grouping the probes into transcript level probe sets using an integrated zebrafish genome annotation. This genome annotation is based on combining transcript information from multiple databases. This new remapping protocol, especially the new genome annotation, is shown here to be an important factor in improving the interpretation of gene expression microarray data.</p> <p>Results</p> <p>Transcript data from the RefSeq, GenBank and Ensembl databases were downloaded from the UCSC genome browser, and integrated to generate a combined zebrafish genome annotation. Affymetrix probes were filtered and remapped according to the new annotation. The influence of transcript collection and gene definition methods was tested using two microarray data sets. Compared to remapping using a single database, this new remapping protocol results in up to 20% more probes being retained in the remapping, leading to approximately 1,000 more genes being detected. The differentially expressed gene lists are consequently increased by up to 30%. We are also able to detect up to three times more alternative splicing events. A small number of the bioinformatics predictions were confirmed using real-time PCR validation.</p> <p>Conclusions</p> <p>By combining gene definitions from multiple databases, it is possible to greatly increase the numbers of genes and splice variants that can be detected in microarray gene expression experiments.</p> http://www.biomedcentral.com/1471-2164/11/50 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Kennedy Breandan Glaviano Antonino Jeffery Ian B McLoughlin Sarah Yin Jun Higgins Desmond G |
spellingShingle |
Kennedy Breandan Glaviano Antonino Jeffery Ian B McLoughlin Sarah Yin Jun Higgins Desmond G Integrating multiple genome annotation databases improves the interpretation of microarray gene expression data BMC Genomics |
author_facet |
Kennedy Breandan Glaviano Antonino Jeffery Ian B McLoughlin Sarah Yin Jun Higgins Desmond G |
author_sort |
Kennedy Breandan |
title |
Integrating multiple genome annotation databases improves the interpretation of microarray gene expression data |
title_sort |
integrating multiple genome annotation databases improves the interpretation of microarray gene expression data |
publisher |
BMC |
series |
BMC Genomics |
issn |
1471-2164 |
publishDate |
2010-01-01 |
description |
<p>Abstract</p> <p>Background</p> <p>The Affymetrix GeneChip is a widely used gene expression profiling platform. Since the chips were originally designed, the genome databases and gene definitions have been considerably updated. Thus, more accurate interpretation of microarray data requires parallel updating of the specificity of GeneChip probes. We propose a new probe remapping protocol, using the zebrafish GeneChips as an example, by removing nonspecific probes, and grouping the probes into transcript level probe sets using an integrated zebrafish genome annotation. This genome annotation is based on combining transcript information from multiple databases. This new remapping protocol, especially the new genome annotation, is shown here to be an important factor in improving the interpretation of gene expression microarray data.</p> <p>Results</p> <p>Transcript data from the RefSeq, GenBank and Ensembl databases were downloaded from the UCSC genome browser, and integrated to generate a combined zebrafish genome annotation. Affymetrix probes were filtered and remapped according to the new annotation. The influence of transcript collection and gene definition methods was tested using two microarray data sets. Compared to remapping using a single database, this new remapping protocol results in up to 20% more probes being retained in the remapping, leading to approximately 1,000 more genes being detected. The differentially expressed gene lists are consequently increased by up to 30%. We are also able to detect up to three times more alternative splicing events. A small number of the bioinformatics predictions were confirmed using real-time PCR validation.</p> <p>Conclusions</p> <p>By combining gene definitions from multiple databases, it is possible to greatly increase the numbers of genes and splice variants that can be detected in microarray gene expression experiments.</p> |
url |
http://www.biomedcentral.com/1471-2164/11/50 |
work_keys_str_mv |
AT kennedybreandan integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata AT glavianoantonino integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata AT jefferyianb integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata AT mcloughlinsarah integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata AT yinjun integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata AT higginsdesmondg integratingmultiplegenomeannotationdatabasesimprovestheinterpretationofmicroarraygeneexpressiondata |
_version_ |
1725449927596703744 |