Block-Constraint Laplacian-Regularized Low-Rank Representation and Its Application for Cancer Sample Clustering Based on Integrated TCGA Data

Low-Rank Representation (LRR) is a powerful subspace clustering method because it successfully learns the low-dimensional subspace structure of data. With the breakthrough of “OMics” technology, many LRR-based methods have been proposed and applied to cancer clustering based on gene expression data. Moreover,...


Bibliographic Details
Main Authors: Juan Wang, Jin-Xing Liu, Chun-Hou Zheng, Cong-Hai Lu, Ling-Yun Dai, Xiang-Zhen Kong
Format: Article
Language: English
Published: Hindawi-Wiley 2020-01-01
Series: Complexity
Online Access: http://dx.doi.org/10.1155/2020/4865738
id doaj-89569e2dca004293a6812acf781992a8
record_format Article
spelling doaj-89569e2dca004293a6812acf781992a8 2020-11-25T01:19:32Z eng Hindawi-Wiley Complexity 1076-2787 1099-0526 2020-01-01 2020 10.1155/2020/4865738 4865738
Block-Constraint Laplacian-Regularized Low-Rank Representation and Its Application for Cancer Sample Clustering Based on Integrated TCGA Data
Juan Wang (0), Jin-Xing Liu (1), Chun-Hou Zheng (2), Cong-Hai Lu (3), Ling-Yun Dai (4), Xiang-Zhen Kong (5)
(0, 1, 3, 4, 5) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China
(2) School of Software Engineering, Qufu Normal University, Qufu, Shandong 273165, China
collection DOAJ
language English
format Article
sources DOAJ
author Juan Wang
Jin-Xing Liu
Chun-Hou Zheng
Cong-Hai Lu
Ling-Yun Dai
Xiang-Zhen Kong
publisher Hindawi-Wiley
series Complexity
issn 1076-2787
1099-0526
publishDate 2020-01-01
description Low-Rank Representation (LRR) is a powerful subspace clustering method because it successfully learns the low-dimensional subspace structure of data. With the breakthrough of “OMics” technology, many LRR-based methods have been proposed and applied to cancer clustering based on gene expression data. Moreover, studies have shown that, besides gene expression data, other genomic data in TCGA also contain important information for cancer research. These genomic data can therefore be integrated as a comprehensive feature source for cancer clustering, and establishing an effective clustering model for the comprehensive analysis of integrated TCGA data has become a key issue. In this paper, we extend the traditional LRR method and propose a novel method named Block-constraint Laplacian-Regularized Low-Rank Representation (BLLRR) to model multigenome data for cancer sample clustering. The proposed method aims to extract richer subspace structure information from multiple types of genomic data in order to improve the accuracy of cancer sample clustering. To account for the heterogeneity of the different types of genomic data, we introduce the block-constraint idea: in the BLLRR decomposition, each type of genomic data is treated as a data block, and different constraints are imposed on different blocks. In addition, a graph Laplacian is introduced to better learn the topological structure of the data by preserving local geometric information. The experiments demonstrate that BLLRR can effectively analyze integrated TCGA data and extract more subspace structure information from multigenome data. It is a reliable and efficient algorithm for cancer sample clustering.
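The graph Laplacian term mentioned in the description can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the BLLRR objective, so the code only shows the generic ingredient, a Laplacian L = D - W built from a k-nearest-neighbor graph over samples, and the usual smoothness penalty tr(Z L Z^T). The binary k-NN weighting, the toy data, and the names `knn_graph_laplacian` and `smoothness` are assumptions made for illustration.

```python
import numpy as np

def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetric k-NN graph.

    X has shape (n_features, n_samples): one column per sample.
    Binary edge weights are an assumption of this sketch.
    """
    n = X.shape[1]
    # Squared Euclidean distances between all pairs of samples (columns).
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    W = np.zeros((n, n))
    nearest = np.argsort(dist, axis=1)[:, :k]  # k closest samples per row
    rows = np.repeat(np.arange(n), k)
    W[rows, nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize: keep an edge if either end chose it

    D = np.diag(W.sum(axis=1))  # degree matrix
    return D - W

def smoothness(Z, L):
    """Laplacian penalty tr(Z L Z^T); small when neighbors have similar columns in Z."""
    return float(np.trace(Z @ L @ Z.T))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 12))   # toy data: 20 features, 12 samples
L = knn_graph_laplacian(X, k=3)
Z = rng.normal(size=(12, 12))   # stand-in for a learned representation matrix
penalty = smoothness(Z, L)
```

Adding such a penalty to a representation-learning objective encourages samples that are close in the input space to receive similar representations, which is the sense in which local geometric information is preserved.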
url http://dx.doi.org/10.1155/2020/4865738