Evaluation of Methods in Removing Batch Effects on RNA-seq Data
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | International Biological and Medical Journals Publishing House Co., Limited, 2016-04-01 |
Series: | Infectious Diseases and Translational Medicine |
Subjects: | |
Online Access: | http://www.tran-med.com/EN/abstract/abstract24.shtml |
Summary: | It is common and advantageous for researchers to combine RNA-seq data from similar studies to increase statistical power in genomic analyses. However, unwanted noise and hidden artifacts such as batch effects can dramatically reduce the accuracy of statistical inference. This study evaluates the performance of three methods, SVA, ComBat and PCA, for correcting batch effects in RNA-seq data. Two simulated datasets are generated to mimic real data from a common RNA-seq experiment. The results show that the SVA method performs best, while the ComBat method over-corrects the batch effect. Most importantly, a carefully designed experiment that distributes samples evenly across batches can minimize confounding or correlation between batches and thus lead to unbiased results. |
---|---|
ISSN: | 2411-2917 |
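
The summary's closing point, that spreading samples evenly across batches limits confounding, can be illustrated with a small noise-free toy example. This is only a sketch and not the simulation used in the article: the additive batch shift, the function names (`simulate`, `naive_effect`) and all numeric values are assumptions chosen for illustration, and none of the evaluated methods (SVA, ComBat, PCA) is applied.

```python
from statistics import mean


def simulate(conditions, batches, base=10.0, effect=2.0, batch_shift=3.0):
    """Toy expression values for one gene (hypothetical model, no noise):
    base level + treatment effect + additive batch shift."""
    return [base + effect * c + batch_shift * b for c, b in zip(conditions, batches)]


def naive_effect(values, conditions):
    """Difference in group means, ignoring batch entirely."""
    treated = [v for v, c in zip(values, conditions) if c == 1]
    control = [v for v, c in zip(values, conditions) if c == 0]
    return mean(treated) - mean(control)


# Four control (0) and four treated (1) samples.
conditions = [0, 0, 0, 0, 1, 1, 1, 1]

# Balanced design: each batch holds two controls and two treated samples.
balanced_batches = [0, 0, 1, 1, 0, 0, 1, 1]

# Confounded design: all controls in batch 0, all treated samples in batch 1.
confounded_batches = [0, 0, 0, 0, 1, 1, 1, 1]

balanced_estimate = naive_effect(simulate(conditions, balanced_batches), conditions)
confounded_estimate = naive_effect(simulate(conditions, confounded_batches), conditions)

print("true effect:          2.0")
print("balanced estimate:   ", balanced_estimate)
print("confounded estimate: ", confounded_estimate)
```

In the balanced layout the batch shift contributes equally to both group means and cancels out of the difference, so the naive estimate matches the assumed effect. In the confounded layout the shift is added in full to the estimate, and no downstream correction can separate it from the treatment effect using these data alone.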