Evaluation of Methods in Removing Batch Effects on RNA-seq Data

Bibliographic Details
Main Authors: Qian Liu, Marianthi Markatou
Format: Article
Language: English
Published: International Biological and Medical Journals Publishing House Co., Limited 2016-04-01
Series: Infectious Diseases and Translational Medicine
Subjects: SVA
Online Access: http://www.tran-med.com/EN/abstract/abstract24.shtml
Description
Summary: It is common and advantageous for researchers to combine RNA-seq data from similar studies to increase statistical power in genomics analysis. However, unwanted noise and hidden artifacts such as batch effects can dramatically reduce the accuracy of statistical inference. The performance of three methods, SVA, ComBat and PCA, for correcting batch effects in RNA-seq data is evaluated. Two simulated datasets are generated to mimic real data from a typical RNA-seq experiment. The results show that the SVA method performs best, while the ComBat method over-corrects the batch effect. Most importantly, a carefully designed experiment that distributes samples evenly across batches can minimize the confounding or correlation between batches and thus lead to unbiased results.
ISSN: 2411-2917