scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data
Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide...
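Main Authors: | Bobby Ranjan, Florian Schmidt, Wenjie Sun, Jinyu Park, Mohammad Amin Honardoost, Joanna Tan, Nirmala Arul Rayan, Shyam Prabhakar |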
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2021-04-01 |
Series: | BMC Bioinformatics |
Subjects: | ScRNA-seq, Clustering, Cell type annotation, Consensus method |
Online Access: | https://doi.org/10.1186/s12859-021-04028-4 |
id |
doaj-67a08cdd0d1742028e2d238ecb65a93c |
---|---|
record_format |
Article |
spelling |
doaj-67a08cdd0d1742028e2d238ecb65a93c 2021-04-18T11:51:44Z eng BMC BMC Bioinformatics 1471-2105 2021-04-01 22 1 1 15 10.1186/s12859-021-04028-4 scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data Bobby Ranjan 0 Florian Schmidt 1 Wenjie Sun 2 Jinyu Park 3 Mohammad Amin Honardoost 4 Joanna Tan 5 Nirmala Arul Rayan 6 Shyam Prabhakar 7 Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. Results: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus. https://doi.org/10.1186/s12859-021-04028-4 ScRNA-seq Clustering Cell type annotation Consensus method |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Bobby Ranjan Florian Schmidt Wenjie Sun Jinyu Park Mohammad Amin Honardoost Joanna Tan Nirmala Arul Rayan Shyam Prabhakar |
spellingShingle |
Bobby Ranjan Florian Schmidt Wenjie Sun Jinyu Park Mohammad Amin Honardoost Joanna Tan Nirmala Arul Rayan Shyam Prabhakar scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data BMC Bioinformatics ScRNA-seq Clustering Cell type annotation Consensus method |
author_facet |
Bobby Ranjan Florian Schmidt Wenjie Sun Jinyu Park Mohammad Amin Honardoost Joanna Tan Nirmala Arul Rayan Shyam Prabhakar |
author_sort |
Bobby Ranjan |
title |
scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
title_short |
scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
title_full |
scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
title_fullStr |
scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
title_full_unstemmed |
scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data |
title_sort |
scconsensus: combining supervised and unsupervised clustering for cell type identification in single-cell rna sequencing data |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2021-04-01 |
description |
Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. Results: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus. |
topic |
ScRNA-seq Clustering Cell type annotation Consensus method |
url |
https://doi.org/10.1186/s12859-021-04028-4 |
work_keys_str_mv |
AT bobbyranjan scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT florianschmidt scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT wenjiesun scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT jinyupark scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT mohammadaminhonardoost scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT joannatan scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT nirmalaarulrayan scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata AT shyamprabhakar scconsensuscombiningsupervisedandunsupervisedclusteringforcelltypeidentificationinsinglecellrnasequencingdata |
_version_ |
1721521826336604160 |