Block-Constraint Laplacian-Regularized Low-Rank Representation and Its Application for Cancer Sample Clustering Based on Integrated TCGA Data
Low-Rank Representation (LRR) is a powerful subspace clustering method because it successfully learns the low-dimensional subspace structure of data. With breakthroughs in “omics” technology, many LRR-based methods have been proposed and applied to cancer clustering based on gene expression data. Moreover,...
Main Authors: | Juan Wang, Jin-Xing Liu, Chun-Hou Zheng, Cong-Hai Lu, Ling-Yun Dai, Xiang-Zhen Kong |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi-Wiley, 2020-01-01 |
Series: | Complexity |
Online Access: | http://dx.doi.org/10.1155/2020/4865738 |
id |
doaj-89569e2dca004293a6812acf781992a8 |
---|---|
record_format |
Article |
spelling |
doaj-89569e2dca004293a6812acf781992a8; 2020-11-25T01:19:32Z; eng; Hindawi-Wiley; Complexity; 1076-2787; 1099-0526; 2020-01-01; 2020; 10.1155/2020/4865738; 4865738; Block-Constraint Laplacian-Regularized Low-Rank Representation and Its Application for Cancer Sample Clustering Based on Integrated TCGA Data; Juan Wang (0), Jin-Xing Liu (1), Chun-Hou Zheng (2), Cong-Hai Lu (3), Ling-Yun Dai (4), Xiang-Zhen Kong (5); (0) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China; (1) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China; (2) School of Software Engineering, Qufu Normal University, Qufu, Shandong 273165, China; (3) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China; (4) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China; (5) School of Information Science and Engineering, Qufu Normal University, Rizhao, Shandong 276826, China; http://dx.doi.org/10.1155/2020/4865738 |
collection |
DOAJ |
sources |
DOAJ |
author |
Juan Wang Jin-Xing Liu Chun-Hou Zheng Cong-Hai Lu Ling-Yun Dai Xiang-Zhen Kong |
author_sort |
Juan Wang |
title |
Block-Constraint Laplacian-Regularized Low-Rank Representation and Its Application for Cancer Sample Clustering Based on Integrated TCGA Data |
publisher |
Hindawi-Wiley |
series |
Complexity |
issn |
1076-2787 1099-0526 |
publishDate |
2020-01-01 |
description |
Low-Rank Representation (LRR) is a powerful subspace clustering method because it successfully learns the low-dimensional subspace structure of data. With breakthroughs in “omics” technology, many LRR-based methods have been proposed and applied to cancer clustering based on gene expression data. Moreover, studies have shown that, besides gene expression data, other genomic data in TCGA also contain important information for cancer research. These genomic data can therefore be integrated as a comprehensive feature source for cancer clustering, and how to establish an effective clustering model for the comprehensive analysis of integrated TCGA data has become a key issue. In this paper, we extend the traditional LRR method and propose a novel method named Block-constraint Laplacian-Regularized Low-Rank Representation (BLLRR) to model multigenome data for cancer sample clustering. The proposed method aims to extract richer subspace structure information from multiple genomic data types to improve the accuracy of cancer sample clustering. Considering the heterogeneity of the different genomic data, we introduce the block-constraint idea into our method: in the BLLRR decomposition, we treat each type of genomic data as a data block and impose different constraints on different data blocks. In addition, a graph Laplacian is introduced into our method to better learn the topological structure of the data by preserving local geometric information. The experiments demonstrate that the BLLRR method can effectively analyze integrated TCGA data and extract more subspace structure information from multigenome data. It is a reliable and efficient clustering algorithm for cancer sample clustering. |
url |
http://dx.doi.org/10.1155/2020/4865738 |
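
The abstract describes two ingredients: stacking several genomic data types as blocks of one feature matrix, and a graph Laplacian term that preserves local geometry among samples. The sketch below illustrates only those two ideas with NumPy; it is not the authors' BLLRR solver. The block names (`expression`, `methylation`, `copy_number`), the random data, the kNN graph construction, and the helper functions are all assumptions made for illustration, and the per-block constraints of BLLRR are not reproduced because this record does not specify them.

```python
import numpy as np


def knn_graph_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized kNN graph.

    X has shape (n_features, n_samples); each column is one sample.
    (Hypothetical helper, not taken from the paper.)
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between samples (columns).
    sq = np.sum(X * X, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    W = np.maximum(W, W.T)  # keep an edge if either sample selects the other
    D = np.diag(W.sum(axis=1))
    return D - W


def laplacian_penalty(Z, L):
    """tr(Z L Z^T): small when graph-linked samples have similar columns in Z."""
    return float(np.trace(Z @ L @ Z.T))


rng = np.random.default_rng(0)
n_samples = 12
# Assumed blocks: three data types measured on the same samples.
expression = rng.normal(size=(30, n_samples))
methylation = rng.normal(size=(20, n_samples))
copy_number = rng.normal(size=(10, n_samples))

# Integrated data: blocks stacked along the feature axis.
X = np.vstack([expression, methylation, copy_number])

L = knn_graph_laplacian(X, k=3)
Z = rng.normal(size=(n_samples, n_samples))  # stand-in for a representation matrix
penalty = laplacian_penalty(Z, L)
```

In a graph-regularized low-rank model, a term of this form is typically added to the objective with a weight, so that samples that are neighbors in the input space receive similar representations.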