Evaluation of Methods in Removing Batch Effects on RNA-seq Data
It is common and advantageous for researchers to combine RNA-seq data from similar studies to increase statistical power in genomics analysis. However, unwanted noise and hidden artifacts such as batch effects can dramatically reduce the accuracy of statistical inference. The performance of thr...
Main Authors: | Qian Liu, Marianthi Markatou |
---|---|
Format: | Article |
Language: | English |
Published: | International Biological and Medical Journals Publishing House Co., Limited, 2016-04-01 |
Series: | Infectious Diseases and Translational Medicine |
Subjects: | RNA-seq; Batch effects; SVA |
Online Access: | http://www.tran-med.com/EN/abstract/abstract24.shtml |
id |
doaj-ca25b72d328543f78d74fba0ce5b8889 |
---|---|
record_format |
Article |
spelling |
doaj-ca25b72d328543f78d74fba0ce5b8889; 2020-11-25T00:24:45Z; eng; International Biological and Medical Journals Publishing House Co., Limited; Infectious Diseases and Translational Medicine; 2411-2917; 2411-2917; 2016-04-01; 2139; 10.11979/idtm.201601002; Evaluation of Methods in Removing Batch Effects on RNA-seq Data; Qian Liu (0), Marianthi Markatou (1); Department of Biostatistics, School of Public Health and Health Professions, University at Buffalo, SUNY. Buffalo, NY 14214; Department of Biostatistics, School of Public Health and Health Professions, and Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, SUNY. Buffalo, NY 14214; It is common and advantageous for researchers to combine RNA-seq data from similar studies to increase statistical power in genomics analysis. However, unwanted noise and hidden artifacts such as batch effects can dramatically reduce the accuracy of statistical inference. The performance of three different methods, SVA, ComBat and PCA, for correcting batch effects in RNA-seq data is evaluated. Two simulated datasets are generated to mimic real data from a common RNA-seq experiment. The results show that the SVA method has the best performance, while the ComBat method over-corrects the batch effect. Most importantly, a carefully designed experiment, which optimizes the even distribution of samples in different batches, could minimize the confounding or correlation between batches and thus lead to unbiased results.; http://www.tran-med.com/EN/abstract/abstract24.shtml; RNA-seq; Batch effects; SVA |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Qian Liu Marianthi Markatou |
spellingShingle |
Qian Liu Marianthi Markatou Evaluation of Methods in Removing Batch Effects on RNA-seq Data Infectious Diseases and Translational Medicine RNA-seq Batch effects SVA |
author_facet |
Qian Liu Marianthi Markatou |
author_sort |
Qian Liu |
title |
Evaluation of Methods in Removing Batch Effects on RNA-seq Data |
title_sort |
evaluation of methods in removing batch effects on rna-seq data |
publisher |
International Biological and Medical Journals Publishing House Co., Limited |
series |
Infectious Diseases and Translational Medicine |
issn |
2411-2917 2411-2917 |
publishDate |
2016-04-01 |
description |
It is common and advantageous for researchers to combine RNA-seq data from similar studies to increase statistical power in genomics analysis. However, unwanted noise and hidden artifacts such as batch effects can dramatically reduce the accuracy of statistical inference. The performance of three different methods, SVA, ComBat and PCA, for correcting batch effects in RNA-seq data is evaluated. Two simulated datasets are generated to mimic real data from a common RNA-seq experiment. The results show that the SVA method has the best performance, while the ComBat method over-corrects the batch effect. Most importantly, a carefully designed experiment, which optimizes the even distribution of samples in different batches, could minimize the confounding or correlation between batches and thus lead to unbiased results. |
topic |
RNA-seq Batch effects SVA |
url |
http://www.tran-med.com/EN/abstract/abstract24.shtml |
work_keys_str_mv |
AT qianliu evaluationofmethodsinremovingbatcheffectsonrnaseqdata AT marianthimarkatou evaluationofmethodsinremovingbatcheffectsonrnaseqdata |
_version_ |
1725351988925825024 |