Graph-based exploration and clustering analysis of semantic spaces

Abstract The goal of this study is to demonstrate how network science and graph theory tools and concepts can be effectively used for exploring and comparing semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on word2vec representation of wor...

Bibliographic Details
Main Authors:	Alexander Veremyev, Alexander Semenov, Eduardo L. Pasiliao, Vladimir Boginski
Format:	Article
Language:	English
Published:	SpringerOpen 2019-11-01
Series:	Applied Network Science
Subjects:	Semantic spaces, Graph theory, Word2vec similarity networks, Cohesive clusters, Cliques, Clique relaxations
Online Access:	http://link.springer.com/article/10.1007/s41109-019-0228-y

id	doaj-777da7f1f5e64d5b93668dfe5de47832
record_format	Article
spelling	doaj-777da7f1f5e64d5b93668dfe5de47832 2020-11-25T04:00:55Z eng SpringerOpen Applied Network Science 2364-8228 2019-11-01 41126 10.1007/s41109-019-0228-y Graph-based exploration and clustering analysis of semantic spaces; Alexander Veremyev 0; Alexander Semenov 1; Eduardo L. Pasiliao 2; Vladimir Boginski 3; Department of Industrial Engineering and Management Systems, University of Central Florida; Faculty of Information Technology, University of Jyväskylä; Air Force Research Laboratory; Department of Industrial Engineering and Management Systems, University of Central Florida; Abstract The goal of this study is to demonstrate how network science and graph theory tools and concepts can be effectively used for exploring and comparing semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on word2vec representation of words, which is “learnt” from large text corpora (Google news, Amazon reviews), and “human built” word networks derived from the well-known lexical databases: WordNet and Moby Thesaurus. We compare “global” (e.g., degrees, distances, clustering coefficients) and “local” (e.g., most central nodes and community-type dense clusters) characteristics of considered networks. Our observations suggest that human built networks possess more intuitive global connectivity patterns, whereas local characteristics (in particular, dense clusters) of the machine built networks provide much richer information on the contextual usage and perceived meanings of words, which reveals interesting structural differences between human built and machine built semantic networks. To our knowledge, this is the first study that uses graph theory and network science in the considered context; therefore, we also provide interesting examples and discuss potential research directions that may motivate further research on the synthesis of lexicographic and machine learning based tools and lead to new insights in this area.; http://link.springer.com/article/10.1007/s41109-019-0228-y; Semantic spaces, Graph theory, Word2vec similarity networks, Cohesive clusters, Cliques, Clique relaxations
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Alexander Veremyev, Alexander Semenov, Eduardo L. Pasiliao, Vladimir Boginski
spellingShingle	Alexander Veremyev Alexander Semenov Eduardo L. Pasiliao Vladimir Boginski Graph-based exploration and clustering analysis of semantic spaces Applied Network Science Semantic spaces Graph theory Word2vec similarity networks Cohesive clusters Cliques Clique relaxations
author_facet	Alexander Veremyev, Alexander Semenov, Eduardo L. Pasiliao, Vladimir Boginski
author_sort	Alexander Veremyev
title	Graph-based exploration and clustering analysis of semantic spaces
title_short	Graph-based exploration and clustering analysis of semantic spaces
title_full	Graph-based exploration and clustering analysis of semantic spaces
title_fullStr	Graph-based exploration and clustering analysis of semantic spaces
title_full_unstemmed	Graph-based exploration and clustering analysis of semantic spaces
title_sort	graph-based exploration and clustering analysis of semantic spaces
publisher	SpringerOpen
series	Applied Network Science
issn	2364-8228
publishDate	2019-11-01
description	Abstract The goal of this study is to demonstrate how network science and graph theory tools and concepts can be effectively used for exploring and comparing semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on word2vec representation of words, which is “learnt” from large text corpora (Google news, Amazon reviews), and “human built” word networks derived from the well-known lexical databases: WordNet and Moby Thesaurus. We compare “global” (e.g., degrees, distances, clustering coefficients) and “local” (e.g., most central nodes and community-type dense clusters) characteristics of considered networks. Our observations suggest that human built networks possess more intuitive global connectivity patterns, whereas local characteristics (in particular, dense clusters) of the machine built networks provide much richer information on the contextual usage and perceived meanings of words, which reveals interesting structural differences between human built and machine built semantic networks. To our knowledge, this is the first study that uses graph theory and network science in the considered context; therefore, we also provide interesting examples and discuss potential research directions that may motivate further research on the synthesis of lexicographic and machine learning based tools and lead to new insights in this area.
topic	Semantic spaces, Graph theory, Word2vec similarity networks, Cohesive clusters, Cliques, Clique relaxations
url	http://link.springer.com/article/10.1007/s41109-019-0228-y
work_keys_str_mv	AT alexanderveremyev graphbasedexplorationandclusteringanalysisofsemanticspaces AT alexandersemenov graphbasedexplorationandclusteringanalysisofsemanticspaces AT eduardolpasiliao graphbasedexplorationandclusteringanalysisofsemanticspaces AT vladimirboginski graphbasedexplorationandclusteringanalysisofsemanticspaces
_version_	1724448447198658560
