Voting-based cancer module identification by combining topological and data-driven properties.

Recently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related...

Full description

Bibliographic Details
Main Authors: A K M Azad, Hyunju Lee
Format: Article
Language:English
Published: Public Library of Science (PLoS) 2013-01-01
Series:PLoS ONE
Online Access:http://europepmc.org/articles/PMC3734239?pdf=render
id doaj-9c3c70c99aa6418f9d9975407f59ab06
record_format Article
spelling doaj-9c3c70c99aa6418f9d9975407f59ab062020-11-25T01:23:07ZengPublic Library of Science (PLoS)PLoS ONE1932-62032013-01-0188e7049810.1371/journal.pone.0070498Voting-based cancer module identification by combining topological and data-driven properties.A K M AzadHyunju LeeRecently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first built a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations, e.g. GE-GE, CNA-GE, and CNA-CNA, over multiple patients. Next, we propose a voting-based cancer module identification algorithm by combining topological and data-driven properties (VToD algorithm) by using the gene-gene relationship network as a source of data-driven information, and the PPI data as topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among 22 GBM modules, 15, 12, and 20 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Among 23 OVC modules, 19, 18, and 23 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Similarly, we also observed that 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) and specific cancer driver genes, respectively. Our proposed module-detection algorithm significantly outperformed other existing methods in terms of both functional and cancer gene set enrichments. Most of the cancer-related pathways from both cancer data sets found in our algorithm contained more than two types of gene-gene relationships, showing strong positive correlations between the number of different types of relationship and CGC enrichment [Formula: see text]-values (0.64 for GBM and 0.49 for OVC). This study suggests that identified modules containing both expression changes and CNAs can explain cancer-related activities with greater insights.http://europepmc.org/articles/PMC3734239?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author A K M Azad
Hyunju Lee
spellingShingle A K M Azad
Hyunju Lee
Voting-based cancer module identification by combining topological and data-driven properties.
PLoS ONE
author_facet A K M Azad
Hyunju Lee
author_sort A K M Azad
title Voting-based cancer module identification by combining topological and data-driven properties.
title_short Voting-based cancer module identification by combining topological and data-driven properties.
title_full Voting-based cancer module identification by combining topological and data-driven properties.
title_fullStr Voting-based cancer module identification by combining topological and data-driven properties.
title_full_unstemmed Voting-based cancer module identification by combining topological and data-driven properties.
title_sort voting-based cancer module identification by combining topological and data-driven properties.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2013-01-01
description Recently, computational approaches integrating copy number aberrations (CNAs) and gene expression (GE) have been extensively studied to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first built a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations, e.g. GE-GE, CNA-GE, and CNA-CNA, over multiple patients. Next, we propose a voting-based cancer module identification algorithm by combining topological and data-driven properties (VToD algorithm) by using the gene-gene relationship network as a source of data-driven information, and the PPI data as topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among 22 GBM modules, 15, 12, and 20 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Among 23 OVC modules, 19, 18, and 23 modules were significantly enriched with cancer-related KEGG, BioCarta pathways, and GO terms, respectively. Similarly, we also observed that 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) and specific cancer driver genes, respectively. Our proposed module-detection algorithm significantly outperformed other existing methods in terms of both functional and cancer gene set enrichments. Most of the cancer-related pathways from both cancer data sets found in our algorithm contained more than two types of gene-gene relationships, showing strong positive correlations between the number of different types of relationship and CGC enrichment [Formula: see text]-values (0.64 for GBM and 0.49 for OVC). This study suggests that identified modules containing both expression changes and CNAs can explain cancer-related activities with greater insights.
url http://europepmc.org/articles/PMC3734239?pdf=render
work_keys_str_mv AT akmazad votingbasedcancermoduleidentificationbycombiningtopologicalanddatadrivenproperties
AT hyunjulee votingbasedcancermoduleidentificationbycombiningtopologicalanddatadrivenproperties
_version_ 1725123369994551296