Normalized ImQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers
In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications such as social network stud...
Main Authors: | Jie Zhang, Kun Huang |
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2014-01-01 |
Series: | Cancer Informatics |
Online Access: | https://doi.org/10.4137/CIN.S14021 |
id |
doaj-5250e25e8aaf4e50b5ca1dae3ef0152b |
---|---|
record_format |
Article |
spelling |
doaj-5250e25e8aaf4e50b5ca1dae3ef0152b; 2020-11-25T03:09:34Z; eng; SAGE Publishing; Cancer Informatics; 1176-9351; 2014-01-01; 13s3; 10.4137/CIN.S14021; Normalized ImQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers; Jie Zhang [0]; Kun Huang [1]; [0] Department of Biomedical Informatics and Biomedical Informatics Shared Resource, The Ohio State University, Columbus, USA; [1] Department of Biomedical Informatics and Biomedical Informatics Shared Resource, The Ohio State University, Columbus, USA; https://doi.org/10.4137/CIN.S14021 |
collection |
DOAJ |
sources |
DOAJ |
author |
Jie Zhang Kun Huang |
title |
Normalized ImQCM: An Algorithm for Detecting Weak Quasi-Cliques in Weighted Graph with Applications in Gene Co-Expression Module Discovery in Cancers |
publisher |
SAGE Publishing |
series |
Cancer Informatics |
issn |
1176-9351 |
publishDate |
2014-01-01 |
description |
In this paper, we present a new approach for mining weighted networks to identify densely connected modules such as quasi-cliques. Quasi-cliques are densely connected subnetworks in a network. Detecting quasi-cliques is an important topic in data mining, with applications in areas such as social network analysis and biomedicine. Our approach makes two major improvements over previous work. The first is the use of local maximum edges to initialize the search, which avoids excessive overlap among the modules and thereby greatly reduces computing time. The second is the inclusion of a weight normalization procedure that enables the discovery of “subtle” modules with more balanced sizes. We carried out careful tests on multiple parameters and settings using two large cancer datasets. This approach allowed us to identify a large number of gene modules enriched in both biological functions and chromosomal bands in cancer data, suggesting potential roles for copy number variations (CNVs) in cancer development. We then tested the genes in selected modules with enriched chromosomal bands using The Cancer Genome Atlas data, and the results strongly support our hypothesis that the coexpression in these modules is associated with CNVs. While gene coexpression network analyses have been widely adopted in disease studies, most of them focus on the functional relationships of coexpressed genes. The relationship between coexpressed gene modules and CNVs is much less investigated, despite the potential advantage that CNVs could be inferred from such a relationship without genotyping data. Our new approach thus provides a means to carry out deep mining of the gene coexpression network to obtain both functional and genetic information from the expression data. |
url |
https://doi.org/10.4137/CIN.S14021 |
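
The description above mentions using local maximum edges to initialize the module search. The record does not define the term, so the following Python sketch rests on an assumption: an edge counts as a local maximum when its weight is the largest among all edges touching either of its endpoints. The function name `local_maximum_edges`, the node labels, and the toy weights are hypothetical and are not taken from the paper; the weight normalization step and the module-growing search are not shown.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def local_maximum_edges(weights: Dict[Edge, float]) -> List[Edge]:
    """Return candidate seed edges for an undirected weighted graph.

    Assumption (not from the record): an edge is a "local maximum" when its
    weight equals the largest edge weight at both of its endpoints.
    """
    # Track the strongest edge weight seen at each node.
    best: Dict[str, float] = {}
    for (u, v), w in weights.items():
        best[u] = max(best.get(u, w), w)
        best[v] = max(best.get(v, w), w)

    # Keep only edges that attain the maximum at both endpoints.
    seeds = [(u, v) for (u, v), w in weights.items()
             if w == best[u] and w == best[v]]

    # Strongest seeds first, so a search could start from the densest regions.
    return sorted(seeds, key=lambda e: weights[e], reverse=True)


# Toy coexpression-like weights between hypothetical genes A..E.
toy = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.8,
    ("A", "C"): 0.7,
    ("C", "D"): 0.3,
    ("D", "E"): 0.6,
}
seeds = local_maximum_edges(toy)
print(seeds)
```

In this toy graph only two of the five edges qualify as seeds, which illustrates why starting from local maxima might reduce the number of overlapping searches compared with starting from every edge.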