Deconvolution of heterogeneous tumor samples using partial reference signals.

Deconvolution of heterogeneous bulk tumor samples into distinct cellular populations is an important yet challenging problem, particularly when only partial references are available. A common approach to dealing with this problem is to deconvolve the mixed signals using available references and leve...

Bibliographic Details
Main Authors: Yufang Qin, Weiwei Zhang, Xiaoqiang Sun, Siwei Nan, Nana Wei, Hua-Jun Wu, Xiaoqi Zheng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-11-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1008452
id doaj-704af6b68aa44856ace7f974c1e43838
record_format Article
spelling doaj-704af6b68aa44856ace7f974c1e43838 | 2021-04-21T15:45:41Z | eng | Public Library of Science (PLoS) | PLoS Computational Biology | 1553-734X | 1553-7358 | 2020-11-01 | 16 | 11 | e1008452 | 10.1371/journal.pcbi.1008452 | Deconvolution of heterogeneous tumor samples using partial reference signals. | Yufang Qin | Weiwei Zhang | Xiaoqiang Sun | Siwei Nan | Nana Wei | Hua-Jun Wu | Xiaoqi Zheng | https://doi.org/10.1371/journal.pcbi.1008452
collection DOAJ
language English
format Article
sources DOAJ
author Yufang Qin
Weiwei Zhang
Xiaoqiang Sun
Siwei Nan
Nana Wei
Hua-Jun Wu
Xiaoqi Zheng
title Deconvolution of heterogeneous tumor samples using partial reference signals.
publisher Public Library of Science (PLoS)
series PLoS Computational Biology
issn 1553-734X
1553-7358
publishDate 2020-11-01
description Deconvolution of heterogeneous bulk tumor samples into distinct cellular populations is an important yet challenging problem, particularly when only partial references are available. A common approach to this problem is to deconvolve the mixed signals using the available references and treat the remaining signal as a new cell component. However, as indicated in our simulation, such an approach tends to over-estimate the proportions of known cell types and fails to detect novel cell types. Here, we propose PREDE, a partial reference-based deconvolution method using an iterative non-negative matrix factorization algorithm. On simulated datasets across a variety of parameter settings, our method is verified to be effective in estimating cell proportions and expression profiles of unknown cell types. Applying our method to TCGA tumor samples, we found that proportions of pure cancer cells better indicate different subtypes of tumor samples. We also detected several cell types for each cancer type whose proportions successfully predicted patient survival. Our method makes a significant contribution to deconvolution of heterogeneous tumor samples and could be widely applied to a variety of high-throughput bulk data. PREDE is implemented in R and is freely available from GitHub (https://xiaoqizheng.github.io/PREDE).
url https://doi.org/10.1371/journal.pcbi.1008452
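
The description mentions an iterative non-negative matrix factorization with only partial references. As a rough illustration of that general idea, and not the PREDE implementation, the sketch below factors a bulk matrix Y (genes x samples) into W (genes x cell types) and H (cell types x samples), holding the known reference columns of W fixed while learning the remaining columns and all of H. Everything in it is an assumption made for illustration: the function name `partial_reference_nmf`, its parameters, the use of standard multiplicative updates, and the final rescaling of H.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def partial_reference_nmf(y, w_known, n_unknown, n_iter=500, seed=0, eps=1e-9):
    """Hypothetical sketch: factor y ~= w @ h with known columns of w fixed.

    y         : genes x samples, non-negative bulk profiles
    w_known   : genes x k_known reference profiles (never updated)
    n_unknown : number of extra, unreferenced components to learn
    Returns (w, h). Each column of h is rescaled to sum to 1 at the end,
    so w @ h is only guaranteed to match y up to that rescaling.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(y), len(y[0])
    k_known = len(w_known[0])
    k = k_known + n_unknown

    # Known profiles are copied in; unknown profiles and all of h
    # start from random strictly positive values.
    w = [list(w_known[g]) + [rng.random() + 0.1 for _ in range(n_unknown)]
         for g in range(n_genes)]
    h = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(k)]

    for _ in range(n_iter):
        # Multiplicative update for h: H <- H * (W^T Y) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, y)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_samples)]
             for i in range(k)]

        # Multiplicative update for w, applied to the unknown columns only:
        # W <- W * (Y H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(y, ht)
        den = matmul(w, matmul(h, ht))
        for g in range(n_genes):
            for c in range(k_known, k):
                w[g][c] *= num[g][c] / (den[g][c] + eps)

    # Rescale each sample's column of h so it reads as proportions.
    for j in range(n_samples):
        total = sum(h[i][j] for i in range(k)) + eps
        for i in range(k):
            h[i][j] /= total
    return w, h
```

Calling `partial_reference_nmf(y, w_known, n_unknown=1)` with `y` and `w_known` as lists of rows returns the completed profile matrix and the per-sample proportions. Fixing the reference columns is what separates this from unconstrained factorization; how PREDE itself handles constraints and convergence is documented with the R package, not here.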