CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is howev...
Main Authors: | Wenbo Yu, Ahmed Mahfouz, Marcel J. T. Reinders |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2021-04-01 |
Series: | Frontiers in Genetics |
Subjects: | batch correction; auto-encoder; single-cell RNA sequencing; clustering; data integration |
Online Access: | https://www.frontiersin.org/articles/10.3389/fgene.2021.644211/full |
id |
doaj-7be88d0d0a2e484fbcc2ada21e730b85 |
---|---|
record_format |
Article |
spelling |
doaj-7be88d0d0a2e484fbcc2ada21e730b85; 2021-04-13T06:36:42Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2021-04-01; 12; 10.3389/fgene.2021.644211; 644211; CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
Wenbo Yu [0, 1]; Ahmed Mahfouz [2, 3, 4]; Marcel J. T. Reinders [5, 6, 7]
[0] Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China; [1] Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands; [2] Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands; [3] Leiden Computational Biology Center, Leiden University Medical Center, Leiden, Netherlands; [4] Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands; [5] Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands; [6] Leiden Computational Biology Center, Leiden University Medical Center, Leiden, Netherlands; [7] Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.
https://www.frontiersin.org/articles/10.3389/fgene.2021.644211/full
batch correction; auto-encoder; single-cell RNA sequencing; clustering; data integration |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Wenbo Yu; Ahmed Mahfouz; Marcel J. T. Reinders |
author_sort |
Wenbo Yu |
title |
CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq |
title_sort |
cba: cluster-guided batch alignment for single cell rna-seq |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2021-04-01 |
description |
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters. |
topic |
batch correction; auto-encoder; single-cell RNA sequencing; clustering; data integration |
url |
https://www.frontiersin.org/articles/10.3389/fgene.2021.644211/full |
_version_ |
1721529270441869312 |
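
The two-part loss outlined in the description field of this record can be illustrated with a small NumPy sketch. It is only one possible reading of the abstract, not the authors' implementation: the names `pairwise_dist` and `cluster_guided_loss`, the `matched` set of cluster pairs, and the `weight` parameter are hypothetical, and the auto-encoder itself is omitted, so the embeddings `z1` and `z2` are simply passed in.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def cluster_guided_loss(x1, x2, z1, z2, c1, c2, matched, weight=1.0):
    """Distance-preservation loss for two batches (illustrative assumption).

    x1, x2  : original expression matrices (cells x genes), one per batch
    z1, z2  : embeddings of the same cells (cells x latent dimensions)
    c1, c2  : per-cell cluster labels, computed separately in each batch
    matched : set of (cluster in batch 1, cluster in batch 2) pairs assumed
              to be the same cell type (hypothetical input)
    """
    # Term 1: keep cell-cell distances within each batch.
    within = (
        np.mean((pairwise_dist(z1, z1) - pairwise_dist(x1, x1)) ** 2)
        + np.mean((pairwise_dist(z2, z2) - pairwise_dist(x2, x2)) ** 2)
    )

    # Term 2: keep cross-batch distances, but only for cell pairs whose
    # clusters are listed as matched.
    labels1 = np.asarray(c1).tolist()
    labels2 = np.asarray(c2).tolist()
    mask = np.array([[(a, b) in matched for b in labels2] for a in labels1])
    if not mask.any():
        return within
    gap = (pairwise_dist(z1, z2) - pairwise_dist(x1, x2)) ** 2
    return within + weight * gap[mask].mean()


# Toy usage with random data: an embedding identical to the input incurs no
# penalty, while shifting one batch in the embedding space does.
rng = np.random.default_rng(0)
x1 = rng.normal(size=(6, 4))
x2 = rng.normal(size=(5, 4))
c1 = [0, 0, 0, 1, 1, 1]
c2 = [0, 0, 1, 1, 1]
matched = {(0, 0), (1, 1)}

loss_identity = cluster_guided_loss(x1, x2, x1, x2, c1, c2, matched)
loss_shifted = cluster_guided_loss(x1, x2, x1, x2 + 3.0, c1, c2, matched)
```

In this sketch the cross-batch term compares embedded distances with the original ones only for matched clusters, so unmatched clusters are free to stay separated. How the published method weights the two terms and decides which clusters correspond is not stated in this record.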