Disease Module Identification Based on Representation Learning of Complex Networks Integrated From GWAS, eQTL Summaries, and Human Interactome

The study of disease-relevant gene modules is one of the main methods to discover disease pathways and potential drug targets. Recent studies have found that most disease proteins tend to form many separate connected components and scatter across the protein-protein interaction network. However, most...


Bibliographic Details
Main Authors: Tao Wang, Qidi Peng, Bo Liu, Yongzhuang Liu, Yadong Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-05-01
Series: Frontiers in Bioengineering and Biotechnology
Subjects: disease module identification; GWAS; eQTL; node2vec; hierarchical clustering
Online Access: https://www.frontiersin.org/article/10.3389/fbioe.2020.00418/full
id doaj-067fc769d98346ebb455760e4bd04641
record_format Article
spelling doaj-067fc769d98346ebb455760e4bd04641; 2020-11-25T02:14:04Z; eng; Frontiers Media S.A.; Frontiers in Bioengineering and Biotechnology; 2296-4185; 2020-05-01; 8; 10.3389/fbioe.2020.00418; 540758
collection DOAJ
language English
format Article
sources DOAJ
author Tao Wang
Qidi Peng
Bo Liu
Yongzhuang Liu
Yadong Wang
title Disease Module Identification Based on Representation Learning of Complex Networks Integrated From GWAS, eQTL Summaries, and Human Interactome
publisher Frontiers Media S.A.
series Frontiers in Bioengineering and Biotechnology
issn 2296-4185
publishDate 2020-05-01
description The study of disease-relevant gene modules is one of the main methods to discover disease pathways and potential drug targets. Recent studies have found that most disease proteins tend to form many separate connected components and scatter across the protein-protein interaction network. However, most research on discovering disease modules is biased toward well-studied seed genes and tends to extend seed genes into a single connected subnetwork. In this paper, we propose N2V-HC, an algorithm framework that aims to discover the scattered disease modules in an unbiased way, based on deep representation learning of integrated multi-layer biological networks. Our method first predicts disease-associated genes based on summary data of Genome-wide Association Studies (GWAS) and expression Quantitative Trait Loci (eQTL) studies, and generates an integrated network on the basis of the human interactome. The features of nodes in the network are then extracted by deep representation learning. Hierarchical clustering with a dynamic tree cut method is then applied to discover the modules that are enriched with disease-associated genes. Evaluation on real and simulated networks shows that N2V-HC performs better than existing methods in network module discovery. Case studies on Parkinson's disease and Alzheimer's disease show that N2V-HC can be used to discover biologically meaningful modules related to the pathways underlying complex diseases.
topic disease module identification
GWAS
eQTL
node2vec
hierarchical clustering
url https://www.frontiersin.org/article/10.3389/fbioe.2020.00418/full
_version_ 1724902188816269312
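
The description above states that modules are selected when they are enriched with disease-associated genes, but this record does not say which statistic is used. As a minimal sketch only, assuming a one-sided hypergeometric test (an assumption, not confirmed by the record), the enrichment of a module could be scored as below. The function name `hypergeom_enrichment_p`, the gene labels, and the toy modules are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(module, disease_genes, background):
    """Upper-tail hypergeometric p-value P(X >= k).

    X is the number of disease-associated genes seen when drawing
    len(module) genes without replacement from the background.
    Assumption: this test is an illustrative choice, not necessarily
    the statistic used by N2V-HC.
    """
    background = set(background)
    module = set(module) & background          # ignore genes outside the background
    disease = set(disease_genes) & background
    n_background = len(background)
    n_disease = len(disease)
    n_module = len(module)
    k = len(module & disease)                  # observed overlap

    total = comb(n_background, n_module)
    tail = sum(
        comb(n_disease, i) * comb(n_background - n_disease, n_module - i)
        for i in range(k, min(n_disease, n_module) + 1)
    )
    return tail / total


# Toy data: hypothetical gene labels, not taken from the paper.
background = {f"g{i}" for i in range(10)}
disease_genes = {"g0", "g1", "g2"}
modules = {
    "module_A": {"g0", "g1", "g2", "g3"},   # contains all three disease genes
    "module_B": {"g4", "g5", "g6", "g7"},   # contains none
}

p_values = {
    name: hypergeom_enrichment_p(genes, disease_genes, background)
    for name, genes in modules.items()
}

for name, p in sorted(p_values.items()):
    print(f"{name}: p = {p:.4f}")
```

In this toy setting `module_A` receives a small p-value and `module_B` a p-value of 1, so only the first would be kept as a candidate disease module. A real analysis would also need multiple-testing correction across modules, which is omitted here.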