Consensus clustering applied to multi-omics disease subtyping

Abstract. Background: Facing the diversity of omics data and the difficulty of selecting one result among all those produced by several methods, consensus strategies have the potential to reconcile multiple inputs and to produce robust results. Results: Here, we introduce ClustOmics, a generic consensus...


Bibliographic Details
Main Authors: Galadriel Brière, Élodie Darbo, Patricia Thébault, Raluca Uricaru
Format: Article
Language: English
Published: BMC 2021-07-01
Series: BMC Bioinformatics
Subjects: Disease subtyping; Multi-omic data; Data integration; Consensus clustering
Online Access: https://doi.org/10.1186/s12859-021-04279-1
id doaj-740f6253a4c542d996777314ffecd3b0
record_format Article
spelling doaj-740f6253a4c542d996777314ffecd3b0
2021-07-11T11:14:44Z
eng
BMC
BMC Bioinformatics
1471-2105
2021-07-01
vol. 22, no. 1, pp. 1-29
10.1186/s12859-021-04279-1
Consensus clustering applied to multi-omics disease subtyping
Galadriel Brière (CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux)
Élodie Darbo (CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux)
Patricia Thébault (CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux)
Raluca Uricaru (CNRS, Bordeaux INP, LaBRI, UMR 5800, Univ. Bordeaux)
https://doi.org/10.1186/s12859-021-04279-1
Disease subtyping
Multi-omic data
Data integration
Consensus clustering
collection DOAJ
language English
format Article
sources DOAJ
author Galadriel Brière
Élodie Darbo
Patricia Thébault
Raluca Uricaru
title Consensus clustering applied to multi-omics disease subtyping
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2021-07-01
description Abstract. Background: Facing the diversity of omics data and the difficulty of selecting one result among all those produced by several methods, consensus strategies have the potential to reconcile multiple inputs and to produce robust results. Results: Here, we introduce ClustOmics, a generic consensus clustering tool that we use in the context of cancer subtyping. ClustOmics relies on a non-relational graph database, which allows for the simultaneous integration of both multiple omics data and results from various clustering methods. This new tool reconciles input clusterings regardless of their origin, number, size, or shape. ClustOmics implements an intuitive and flexible strategy based on the idea of evidence accumulation clustering: it computes the co-occurrences of pairs of samples in input clusters and uses this score as a similarity measure to reorganize the data into consensus clusters. Conclusion: We applied ClustOmics to multi-omics disease subtyping on real TCGA cancer data from ten different cancer types. We showed that ClustOmics is robust to input partitions of heterogeneous quality, smoothing and reconciling preliminary predictions into high-quality consensus clusters, from both a computational and a biological point of view. The comparison with a state-of-the-art consensus-based integration tool, COCA, further corroborated this statement. However, the main interest of ClustOmics is not to compete with other tools, but rather to take advantage of their various predictions when no gold-standard metric is available to assess their significance. Availability: The ClustOmics source code, released under the MIT license, and the results obtained on TCGA cancer data are available on GitHub: https://github.com/galadrielbriere/ClustOmics
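The evidence accumulation idea mentioned in the description can be illustrated with a small Python sketch. It is not the ClustOmics implementation: it uses no graph database, the function names `co_occurrence_counts` and `consensus_clusters` and the `min_support` parameter are hypothetical, and the final grouping step (connected components over pairs that co-occur often enough) is a simple stand-in chosen for illustration, since this record does not describe how ClustOmics derives its consensus clusters from the co-occurrence scores.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_counts(clusterings):
    """Count, for each pair of samples, how many input clusterings
    place both samples in the same cluster."""
    counts = Counter()
    for labels in clusterings:  # labels: dict mapping sample -> cluster label
        clusters = {}
        for sample, label in labels.items():
            clusters.setdefault(label, []).append(sample)
        for members in clusters.values():
            for a, b in combinations(sorted(members), 2):
                counts[(a, b)] += 1
    return counts


def consensus_clusters(clusterings, min_support):
    """Illustrative grouping step (an assumption, not the published method):
    link pairs seen together in at least `min_support` clusterings and
    return the connected components as consensus clusters."""
    counts = co_occurrence_counts(clusterings)
    samples = sorted({s for labels in clusterings for s in labels})
    parent = {s: s for s in samples}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in counts.items():
        if n >= min_support:
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Three toy input clusterings of four samples (made-up data).
inputs = [
    {"s1": 0, "s2": 0, "s3": 1, "s4": 1},
    {"s1": "a", "s2": "a", "s3": "a", "s4": "b"},
    {"s1": 1, "s2": 1, "s3": 2, "s4": 2},
]
print(co_occurrence_counts(inputs)[("s1", "s2")])
print(consensus_clusters(inputs, min_support=2))
```

Cluster labels only need to be consistent within one input clustering, which is why the inputs may differ in the number, size, and naming of their clusters. Raising `min_support` demands stronger agreement between inputs and tends to split the samples into more, smaller groups.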
topic Disease subtyping
Multi-omic data
Data integration
Consensus clustering
url https://doi.org/10.1186/s12859-021-04279-1