CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq

The power of single-cell RNA sequencing (scRNA-seq) to detect cell heterogeneity or developmental processes is becoming increasingly evident. The granularity of this knowledge increases further when two scRNA-seq batches are combined into a single large dataset. This strategy is howev...

Bibliographic Details
Main Authors: Wenbo Yu, Ahmed Mahfouz, Marcel J. T. Reinders
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-04-01
Series: Frontiers in Genetics
Subjects: batch correction; auto-encoder; single-cell RNA sequencing; clustering; data integration
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.644211/full
id doaj-7be88d0d0a2e484fbcc2ada21e730b85
record_format Article
spelling doaj-7be88d0d0a2e484fbcc2ada21e730b85
2021-04-13T06:36:42Z
eng
Frontiers Media S.A.
Frontiers in Genetics
ISSN 1664-8021
2021-04-01
volume 12
DOI 10.3389/fgene.2021.644211
article number 644211
CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
Wenbo Yu (Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China; Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands)
Ahmed Mahfouz (Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands; Leiden Computational Biology Center, Leiden University Medical Center, Leiden, Netherlands; Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands)
Marcel J. T. Reinders (Delft Bioinformatics Lab, Delft University of Technology, Delft, Netherlands; Leiden Computational Biology Center, Leiden University Medical Center, Leiden, Netherlands; Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands)
collection DOAJ
language English
format Article
sources DOAJ
author Wenbo Yu
Ahmed Mahfouz
Marcel J. T. Reinders
title CBA: Cluster-Guided Batch Alignment for Single Cell RNA-seq
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2021-04-01
description The power of single-cell RNA sequencing (scRNA-seq) to detect cell heterogeneity or developmental processes is becoming increasingly evident. The granularity of this knowledge increases further when two scRNA-seq batches are combined into a single large dataset. This strategy is, however, hampered by technical differences between the batches. Typically, these batch effects are resolved by matching similar cells across batches. Current approaches, however, do not exploit the fact that this matching can be constrained further, because cells can also be matched on their cell-type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves (1) cell-cell distances within each of the two batches and (2) cell-cell distances between the two batches when the cells are of the same cell type. The cell-type guidance is unsupervised, i.e., a cell type is defined as a cluster in the original batch. We evaluated our cluster-guided batch alignment (CBA) on pancreas and mouse cell atlas datasets against six state-of-the-art single-cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to the other approaches, CBA preserves the cluster separation of the original datasets while still aligning the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differentially expressed genes for the preserved clusters.
topic batch correction
auto-encoder
single-cell RNA sequencing
clustering
data integration
url https://www.frontiersin.org/articles/10.3389/fgene.2021.644211/full
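
The two-part loss named in the description field above can be illustrated with a small NumPy sketch. It is a literal reading of the abstract only, not the authors' implementation: the function names, the squared-error form, the `weight` parameter, and the assumption that cluster labels are already matched across the two batches are all hypothetical choices made for illustration. The sketch only evaluates the loss for a given embedding; it does not train an auto-encoder.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def cluster_guided_loss(x1, x2, z1, z2, labels1, labels2, weight=1.0):
    """Hypothetical distance-preservation loss for two batches.

    x1, x2: original expression matrices (cells x genes) of the two batches.
    z1, z2: embeddings of the same cells (cells x latent dimensions).
    labels1, labels2: cluster labels, ASSUMED to be comparable across batches.
    """
    # (1) Within-batch term: embedded distances should match original ones.
    within = 0.0
    for x, z in ((x1, z1), (x2, z2)):
        within += np.mean((pairwise_dist(z, z) - pairwise_dist(x, x)) ** 2)

    # (2) Between-batch term, restricted to cell pairs sharing a cluster label.
    same_cluster = labels1[:, None] == labels2[None, :]
    if same_cluster.any():
        gap = (pairwise_dist(z1, z2) - pairwise_dist(x1, x2)) ** 2
        between = gap[same_cluster].mean()
    else:
        between = 0.0

    return within + weight * between


# Toy data: 6 and 4 cells with 5 genes each (random values, not real data).
rng = np.random.default_rng(0)
x1 = rng.normal(size=(6, 5))
x2 = rng.normal(size=(4, 5))
labels1 = np.array([0, 0, 1, 1, 2, 2])
labels2 = np.array([0, 1, 1, 2])

# An embedding identical to the input preserves every distance.
loss_identity = cluster_guided_loss(x1, x2, x1, x2, labels1, labels2)

# Stretching batch 2 distorts its within-batch and cross-batch distances.
loss_stretched = cluster_guided_loss(x1, x2, x1, 2.0 * x2, labels1, labels2)
```

In this sketch the loss is zero when the embedding reproduces all distances and grows as the embedding distorts them; how CBA actually weights the two terms and matches clusters across batches is described in the full article, not in this record.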