Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans

Abstract. Background: Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases...

Bibliographic Details
Main Authors: Munoz Enrique T, Bogarad Leonard D, Deem Michael W
Format: Article
Language: English
Published: BMC 2004-05-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/5/30
id doaj-91fb35523a134c47925730ffe4d13667
record_format Article
spelling doaj-91fb35523a134c47925730ffe4d13667 2020-11-25T00:09:33Z eng BMC BMC Genomics 1471-2164 2004-05-01 5 1 30 10.1186/1471-2164-5-30 http://www.biomedcentral.com/1471-2164/5/30
collection DOAJ
language English
format Article
sources DOAJ
author Munoz Enrique T
Bogarad Leonard D
Deem Michael W
title Microarray and EST database estimates of mRNA expression levels differ: The protein length versus expression curve for C. elegans
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2004-05-01
description Abstract
Background: Various methods for estimating protein expression levels are known. The level of correlation between these methods is only fair, and systematic biases in each of the methods cannot be ruled out. We here investigate systematic biases in the estimation of gene expression rates from microarray data and from abundance within the Expressed Sequence Tag (EST) database. We suggest that length is a significant factor in biases to measured gene expression rates.
As a specific example of the importance of the bias of expression rate with length, we address the following evolutionary question: Does the average C. elegans protein length increase or decrease with expression level? Two different answers to this question have been reported in the literature, one method using expression levels estimated by abundance within the EST database and another using microarrays. We have investigated this issue by constructing the full protein length versus expression curve for C. elegans, using both methods for estimating expression levels.
Results: The microarray data show a monotonic decrease of length with expression level, whereas the abundance within the EST database data show a non-monotonic behavior. Furthermore, the ratio of the expression level estimated by the EST database to that measured by microarrays is not constant, but rather systematically biased with gene length.
Conclusions: It is suggested that the length bias may lie primarily in the abundance within the EST database method, being not ameliorated by internal standards as it is in the microarray data, and that this bias should be removed before data interpretation. When this is done, both the microarray and the abundance within the EST database give a monotonic decrease of spliced length with expression level, and the correlation between the EST and microarray data becomes larger. We suggest that standard RNA controls be used to normalize for length bias in any method that measures expression.
url http://www.biomedcentral.com/1471-2164/5/30