Effect of Normalization on Statistical and Biological Interpretation of Gene Expression Profiles

Bibliographic Details
Main Authors: Shaopu Peter Qin, Jinhee Kim, Dalia Arafat, Greg Gibson
Format: Article
Language: English
Published: Frontiers Media S.A. 2013-05-01
Series: Frontiers in Genetics
Subjects:
SNM
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2012.00160/full
Description
Summary: A neglected aspect of the genetic analysis of gene expression is the impact of normalization on biological inference. Here we contrast nine different methods for normalization of an Illumina bead-array gene expression profiling dataset consisting of peripheral blood samples from 189 individual participants in the Center for Health Discovery and Well-Being (CHDWB) study in Atlanta, quantifying differences in the inference of global variance components and covariance of gene expression, as well as the detection of eSNPs. The normalization strategies, all relative to raw log2 measures, include simple mean centering, two modes of transcript-level linear adjustment for technical factors, and for differential immune cell counts, variance normalization by interquartile range and by quantile, fitting the first 16 principal components, and supervised normalization using the SNM procedure with adjustment for cell counts. Robustness of genetic associations as a consequence of Pearson and Spearman rank correlation is also reported for each method, and it is shown that the normalization strategy has a far greater impact than the correlation method. We describe similarities among methods, discuss the impact on biological interpretation, and make recommendations regarding appropriate strategies.
ISSN:1664-8021