CytoGLMM: conditional differential analysis for flow and mass cytometry experiments
Abstract. Background: Flow and mass cytometry are important modern immunology tools for measuring expression levels of multiple proteins on single cells. The goal is to better understand the mechanisms of responses on a single-cell basis by studying differential expression of proteins. Most current da...
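Main Authors: | Christof Seiler, Anne-Maud Ferreira, Lisa M. Kronstad, Laura J. Simpson, Mathieu Le Gars, Elena Vendrame, Catherine A. Blish, Susan Holmes |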
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2021-03-01 |
Series: | BMC Bioinformatics |
Subjects: | High-dimensional cytometry; Generalized linear models; Generalized linear mixed models |
Online Access: | https://doi.org/10.1186/s12859-021-04067-x |
id |
doaj-27cc448bdff14d00a6742280b3778089 |
---|---|
record_format |
Article |
spelling |
doaj-27cc448bdff14d00a6742280b3778089; 2021-03-28T11:46:17Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2021-03-01; 22; 1; 1; 14; 10.1186/s12859-021-04067-x; CytoGLMM: conditional differential analysis for flow and mass cytometry experiments; Christof Seiler (Department of Data Science and Knowledge Engineering, Maastricht University); Anne-Maud Ferreira (Department of Statistics, Stanford University); Lisa M. Kronstad (Immunology Program, Stanford University School of Medicine); Laura J. Simpson (Immunology Program, Stanford University School of Medicine); Mathieu Le Gars (Immunology Program, Stanford University School of Medicine); Elena Vendrame (Immunology Program, Stanford University School of Medicine); Catherine A. Blish (Immunology Program, Stanford University School of Medicine); Susan Holmes (Department of Statistics, Stanford University); https://doi.org/10.1186/s12859-021-04067-x; High-dimensional cytometry; Generalized linear models; Generalized linear mixed models |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Christof Seiler, Anne-Maud Ferreira, Lisa M. Kronstad, Laura J. Simpson, Mathieu Le Gars, Elena Vendrame, Catherine A. Blish, Susan Holmes |
title |
CytoGLMM: conditional differential analysis for flow and mass cytometry experiments |
publisher |
BMC |
series |
BMC Bioinformatics |
issn |
1471-2105 |
publishDate |
2021-03-01 |
description |
Abstract. Background: Flow and mass cytometry are important modern immunology tools for measuring expression levels of multiple proteins on single cells. The goal is to better understand the mechanisms of responses on a single-cell basis by studying differential expression of proteins. Most current data analysis tools compare expression across many computationally discovered cell types. Our goal is to focus on just one cell type. This narrower field of application allows us to define a more specific statistical model with statistical guarantees that are easier to control. Results: Differential analysis of marker expression can be difficult because of marker correlations and inter-subject heterogeneity, particularly in studies of human immunology. We address these challenges with two multiple regression strategies: a bootstrapped generalized linear model and a generalized linear mixed model. On simulated datasets, we compare the robustness of both strategies to marker correlations and heterogeneity. For paired experiments, we find that both strategies maintain the target false discovery rate under medium correlations and that mixed models are statistically more powerful when the model is correctly specified. For unpaired experiments, our results indicate that much larger patient sample sizes are required to detect differences. We illustrate the CytoGLMM R package and workflow for both strategies on a pregnancy dataset. Conclusion: Our approach to finding differential proteins in flow and mass cytometry data reduces biases arising from marker correlations and safeguards against false discoveries induced by patient heterogeneity. |
topic |
High-dimensional cytometry; Generalized linear models; Generalized linear mixed models |
url |
https://doi.org/10.1186/s12859-021-04067-x |
_version_ |
1724199589211275264 |
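
The description above reports that both regression strategies "maintain the target false discovery rate" but does not say which multiple-testing procedure is used. As an illustration only, the following minimal Python sketch shows the Benjamini-Hochberg step-up procedure, a common way to target a false discovery rate across per-marker p-values. The choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, the marker names, and the p-values are all assumptions made for this example; none of them come from the record or from the CytoGLMM package.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Illustrative Benjamini-Hochberg step-up procedure; it is an assumption
    that this matches the correction used in the catalogued article.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Step-up rule: reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical per-marker p-values, e.g. one per regression coefficient.
marker_p_values = {
    "marker_A": 0.001,
    "marker_B": 0.008,
    "marker_C": 0.039,
    "marker_D": 0.041,
    "marker_E": 0.6,
}
decisions = benjamini_hochberg(list(marker_p_values.values()), fdr=0.05)
flagged = [name for name, hit in zip(marker_p_values, decisions) if hit]
print(flagged)
```

With these made-up inputs only the two smallest p-values fall under their rank-scaled thresholds, so two markers are flagged. In a real analysis the p-values would come from the fitted bootstrapped or mixed model, one per marker.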