Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction

Designing an integrated machine learning model that discovers cancer subtypes and characterizes cancer heterogeneity from multiple omics data is a vital task. In recent years, several multi-view clustering algorithms have been proposed and applied to cancer subtype prediction. Among the...

Bibliographic Details
Main Authors: Jian Liu, Shuguang Ge, Yuhu Cheng, Xuesong Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-09-01
Series: Frontiers in Genetics
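Subjects: multi-view clustering; cancer subtypes prediction; multi-omics data; spectral clustering; smooth representation; graph fusion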
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2021.718915/full
id doaj-3b75e53472eb4d699acf30d71c086c95
record_format Article
spelling doaj-3b75e53472eb4d699acf30d71c086c95; 2021-09-06T05:18:53Z; eng; Frontiers Media S.A.; Frontiers in Genetics; ISSN 1664-8021; 2021-09-01; volume 12; DOI 10.3389/fgene.2021.718915; article 718915
Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
Authors: Jian Liu, Shuguang Ge, Yuhu Cheng, Xuesong Wang
Affiliations (each author is listed with both): School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China; Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou, China
collection DOAJ
language English
format Article
sources DOAJ
author Jian Liu
Shuguang Ge
Yuhu Cheng
Xuesong Wang
spellingShingle Jian Liu
Shuguang Ge
Yuhu Cheng
Xuesong Wang
Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
Frontiers in Genetics
multi-view clustering
cancer subtypes prediction
multi-omics data
spectral clustering
smooth representation
graph fusion
author_facet Jian Liu
Shuguang Ge
Yuhu Cheng
Xuesong Wang
author_sort Jian Liu
title Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
title_short Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
title_full Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
title_fullStr Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
title_full_unstemmed Multi-View Spectral Clustering Based on Multi-Smooth Representation Fusion for Cancer Subtype Prediction
title_sort multi-view spectral clustering based on multi-smooth representation fusion for cancer subtype prediction
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2021-09-01
description Designing an integrated machine learning model that discovers cancer subtypes and characterizes cancer heterogeneity from multiple omics data is a vital task. In recent years, several multi-view clustering algorithms have been proposed and applied to cancer subtype prediction. Among them, multi-view clustering methods based on graph learning have attracted wide attention. These approaches usually suffer from one or more of the following problems. First, many multi-view algorithms construct the similarity matrix directly from the original omics data matrix and do not learn the similarity matrix. Second, they separate data clustering from graph learning, which makes clustering performance highly dependent on the predefined graph. Third, during graph fusion they simply average the affinity graphs of the views to obtain the fused graph, so the rich heterogeneous information is not fully exploited. To address these problems, this paper proposes a Multi-view Spectral Clustering Based on Multi-smooth Representation Fusion (MRF-MSC) method. First, MRF-MSC constructs a smooth representation for each data type, which can be viewed as a sample (patient) similarity matrix; the smooth representation explicitly enhances the grouping effect. Second, MRF-MSC integrates the smooth representations of the multiple omics data through graph fusion to form a single similarity matrix containing the information from all biological data. In addition, MRF-MSC adaptively assigns a weight factor to the smooth regularization representation of each omics data type using a self-weighting method. Finally, MRF-MSC imposes a constrained Laplacian rank on the fused similarity matrix to obtain a better cluster structure. The resulting problem can be transformed into a spectral clustering problem, whose solution yields the clustering results. MRF-MSC unifies graph construction, graph fusion, and spectral clustering in one framework, which allows it to learn better data representations and high-quality graphs and thereby achieve better clustering. In experiments, MRF-MSC obtained good results on the TCGA cancer data sets.
topic multi-view clustering
cancer subtypes prediction
multi-omics data
spectral clustering
smooth representation
graph fusion
url https://www.frontiersin.org/articles/10.3389/fgene.2021.718915/full
work_keys_str_mv AT jianliu multiviewspectralclusteringbasedonmultismoothrepresentationfusionforcancersubtypeprediction
AT shuguangge multiviewspectralclusteringbasedonmultismoothrepresentationfusionforcancersubtypeprediction
AT yuhucheng multiviewspectralclusteringbasedonmultismoothrepresentationfusionforcancersubtypeprediction
AT xuesongwang multiviewspectralclusteringbasedonmultismoothrepresentationfusionforcancersubtypeprediction
_version_ 1717780034390851584
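
Illustrative note on the graph fusion step named in the description field: the abstract says that each omics view receives an adaptive weight through a self-weighting method instead of a plain average, but this record does not give the update rule. The sketch below is therefore only an assumption. It uses a common inverse-distance heuristic, w_v = 1 / (2 * ||S - Z_v||_F), where Z_v is the similarity matrix of view v and S is the fused matrix. The function names, the iteration count, and the toy matrices are hypothetical and are not taken from the article.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def fuse_graphs(views, n_iter=20, eps=1e-12):
    """Fuse per-view similarity matrices with inverse-distance self-weighting.

    ASSUMPTION: the weights follow w_v = 1 / (2 * ||S - Z_v||_F), a common
    auto-weighting heuristic. The article's actual update rule may differ.
    Returns the fused matrix and the view weights normalized to sum to 1.
    """
    n = len(views[0])
    weights = [1.0 / len(views)] * len(views)  # start from a plain average
    fused = None
    for _ in range(n_iter):
        total = sum(weights)
        # Fused graph: weighted average of the per-view similarity matrices.
        fused = [[sum(w * z[i][j] for w, z in zip(weights, views)) / total
                  for j in range(n)]
                 for i in range(n)]
        # Views that lie far from the current consensus get smaller weights.
        weights = [1.0 / (2.0 * max(frobenius_distance(fused, z), eps))
                   for z in views]
    total = sum(weights)
    return fused, [w / total for w in weights]


if __name__ == "__main__":
    # Two views with a clear two-block structure and one uninformative view.
    z1 = [[1, 0.9, 0, 0], [0.9, 1, 0, 0], [0, 0, 1, 0.9], [0, 0, 0.9, 1]]
    z2 = [[1, 0.8, 0, 0], [0.8, 1, 0, 0], [0, 0, 1, 0.8], [0, 0, 0.8, 1]]
    z3 = [[1 if i == j else 0.5 for j in range(4)] for i in range(4)]
    fused, weights = fuse_graphs([z1, z2, z3])
    print("view weights:", [round(w, 3) for w in weights])
```

In this toy setting the uninformative third view ends up with the smallest weight, which is the behavior that weighted fusion is meant to provide over simple averaging. A full pipeline would still need the smooth representation step and the constrained Laplacian rank step, neither of which is sketched here.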