BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine hetero...

Bibliographic Details
Main Authors: Catalina A Vallejos, John C Marioni, Sylvia Richardson
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-06-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1004333
id doaj-7c973ae27dc440d8b43be811ddec63ba
record_format Article
spelling doaj-7c973ae27dc440d8b43be811ddec63ba | 2021-04-21T15:00:14Z | eng | Public Library of Science (PLoS) | PLoS Computational Biology | 1553-734X | 1553-7358 | 2015-06-01 | 11 | 6 | e1004333 | 10.1371/journal.pcbi.1004333 | BASiCS: Bayesian Analysis of Single-Cell Sequencing Data. | Catalina A Vallejos | John C Marioni | Sylvia Richardson | https://doi.org/10.1371/journal.pcbi.1004333
collection DOAJ
language English
format Article
sources DOAJ
author Catalina A Vallejos
John C Marioni
Sylvia Richardson
title BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2015-06-01
description Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell's lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated with high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse embryonic stem cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable support the efficacy of our approach.
url https://doi.org/10.1371/journal.pcbi.1004333