Graph embedding with rich information through heterogeneous graph

Graph embedding, which aims to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical applications, including node classification, link prediction, and clustering in social network analysis. Most existing algorithms for graph embedding only rely on...


Bibliographic Details
Main Author: Sun, Guolei
Other Authors: Zhang, Xiangliang
Language: en
Published: 2017
Subjects: Graph embedding; heterogeneous graph; rich information; random walk
Online Access: http://hdl.handle.net/10754/626207
http://repository.kaust.edu.sa/kaust/handle/10754/626207
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-626207
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-626207 2017-11-28T03:59:05Z Graph embedding with rich information through heterogeneous graph Sun, Guolei Zhang, Xiangliang Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division Gao, Xin Moshkov, Mikhail Graph embedding heterogeneous graph rich information random walk Graph embedding, which aims to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical applications, including node classification, link prediction, and clustering in social network analysis. Most existing algorithms for graph embedding rely only on topology information and fail to use the copious information in nodes and edges. As a result, their performance on many tasks may not be satisfactory. In this thesis, we propose a novel and general framework for graph embedding with rich text information (GERI) by constructing a heterogeneous network in which we integrate node and edge content information with graph topology. Specifically, we design a novel biased random walk to explore the constructed heterogeneous network with the notion of flexible neighborhood. Our sampling strategy can trade off between BFS and DFS local search on the heterogeneous graph. To further improve our algorithm, we propose semi-supervised GERI (SGERI), which learns graph embeddings in a discriminative manner through a heterogeneous network with label information. The efficacy of our method is demonstrated by extensive comparison experiments with 9 baselines on multi-label and multi-class classification over various datasets, including Citeseer, Cora, DBLP, and Wiki. The results show that GERI improves the Micro-F1 and Macro-F1 of node classification by up to 10%, and SGERI improves on GERI by 5% on Wiki. 2017-11-12 Thesis http://hdl.handle.net/10754/626207 http://repository.kaust.edu.sa/kaust/handle/10754/626207 en
collection NDLTD
language en
sources NDLTD
topic Graph embedding
heterogeneous graph
rich information
random walk
spellingShingle Graph embedding
heterogeneous graph
rich information
random walk
Sun, Guolei
Graph embedding with rich information through heterogeneous graph
description Graph embedding, which aims to learn low-dimensional representations for nodes in graphs, has attracted increasing attention due to its critical applications, including node classification, link prediction, and clustering in social network analysis. Most existing algorithms for graph embedding rely only on topology information and fail to use the copious information in nodes and edges. As a result, their performance on many tasks may not be satisfactory. In this thesis, we propose a novel and general framework for graph embedding with rich text information (GERI) by constructing a heterogeneous network in which we integrate node and edge content information with graph topology. Specifically, we design a novel biased random walk to explore the constructed heterogeneous network with the notion of flexible neighborhood. Our sampling strategy can trade off between BFS and DFS local search on the heterogeneous graph. To further improve our algorithm, we propose semi-supervised GERI (SGERI), which learns graph embeddings in a discriminative manner through a heterogeneous network with label information. The efficacy of our method is demonstrated by extensive comparison experiments with 9 baselines on multi-label and multi-class classification over various datasets, including Citeseer, Cora, DBLP, and Wiki. The results show that GERI improves the Micro-F1 and Macro-F1 of node classification by up to 10%, and SGERI improves on GERI by 5% on Wiki.
author2 Zhang, Xiangliang
author_facet Zhang, Xiangliang
Sun, Guolei
author Sun, Guolei
author_sort Sun, Guolei
title Graph embedding with rich information through heterogeneous graph
title_short Graph embedding with rich information through heterogeneous graph
title_full Graph embedding with rich information through heterogeneous graph
title_fullStr Graph embedding with rich information through heterogeneous graph
title_full_unstemmed Graph embedding with rich information through heterogeneous graph
title_sort graph embedding with rich information through heterogeneous graph
publishDate 2017
url http://hdl.handle.net/10754/626207
http://repository.kaust.edu.sa/kaust/handle/10754/626207
work_keys_str_mv AT sunguolei graphembeddingwithrichinformationthroughheterogeneousgraph
_version_ 1718563134210310144
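
Illustrative sketch (not part of the catalog record): the description above mentions a heterogeneous network that combines node content with graph topology, and a biased random walk that trades off between BFS-like and DFS-like exploration. The record does not give the thesis's actual construction or transition rule, so the Python below is only a minimal sketch under stated assumptions: text is attached by adding one hypothetical "word" node per token, and the bias uses node2vec-style return and in-out parameters `p` and `q`, which are swapped in here for illustration. The function names `build_heterogeneous_graph` and `biased_walk` are hypothetical.

```python
import random
from collections import defaultdict


def build_heterogeneous_graph(edges, node_text):
    """Build undirected adjacency sets over two node types.

    Assumption: content is attached by linking each original node to one
    "word" node per whitespace-separated token of its text.
    """
    adj = defaultdict(set)
    for u, v in edges:  # topology: node-node links
        adj[("node", u)].add(("node", v))
        adj[("node", v)].add(("node", u))
    for u, text in node_text.items():  # content: node-word links
        for word in text.lower().split():
            adj[("node", u)].add(("word", word))
            adj[("word", word)].add(("node", u))
    return adj


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order walk with node2vec-style bias (an assumed stand-in).

    Small q favors moving away from the previous node (DFS-like);
    large q favors staying near it (BFS-like); p controls backtracking.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducibility with a seed
        if not nbrs:
            break
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is unbiased
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay within prev's neighborhood
            else:
                weights.append(1.0 / q)  # move farther out
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Toy data, invented for this example only.
edges = [("a", "b"), ("b", "c")]
node_text = {"a": "graph embedding", "c": "graph mining"}
adj = build_heterogeneous_graph(edges, node_text)
walk = biased_walk(adj, ("node", "a"), 10, p=1.0, q=0.5, rng=random.Random(0))
```

In this toy graph, nodes `a` and `c` are not directly linked, but both connect to the shared word node `("word", "graph")`, so a walk can move between them through content as well as through topology.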