TWE‐WSD: An effective topical word embedding based word sense disambiguation

Abstract: Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it can represent the semantics of words well. However, most existing word embedding methods represent each word as a single vector, without considering the homonymy and polysemy of words; their performance is therefore limited. To address this problem, an effective topical word embedding (TWE)-based WSD method, named TWE-WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE-WSD generates a topical WV for each word under each topic. Effective integration strategies are designed to obtain high-quality contextual vectors. Extensive experiments on the SemEval-2013 and SemEval-2015 English all-words tasks show that TWE-WSD outperforms other state-of-the-art WSD methods, especially on nouns.


Bibliographic Details
Main Authors: Lianyin Jia, Jilin Tang, Mengjuan Li, Jinguo You, Jiaman Ding, Yinong Chen
Format: Article
Language: English
Published: Wiley 2021-03-01
Series: CAAI Transactions on Intelligence Technology
Online Access: https://doi.org/10.1049/cit2.12006
id doaj-e1c5eb3fa41a4b1dbaa91177803f3f43
record_format Article
spelling doaj-e1c5eb3fa41a4b1dbaa91177803f3f43
Indexed: 2021-04-20T13:35:04Z
Language: eng
Publisher: Wiley
Series: CAAI Transactions on Intelligence Technology (ISSN 2468-2322)
Published: 2021-03-01, Vol. 6, Iss. 1, pp. 72-79
DOI: 10.1049/cit2.12006
Title: TWE‐WSD: An effective topical word embedding based word sense disambiguation
Authors and affiliations:
Lianyin Jia, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Jilin Tang, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Mengjuan Li, Library, Yunnan Normal University, Kunming, China
Jinguo You, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Jiaman Ding, Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Yinong Chen, School of Computing, Informatics, and Decision Systems, Arizona State University, Tempe, Arizona, USA
Abstract: Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it can represent the semantics of words well. However, most existing word embedding methods represent each word as a single vector, without considering the homonymy and polysemy of words; their performance is therefore limited. To address this problem, an effective topical word embedding (TWE)-based WSD method, named TWE-WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE-WSD generates a topical WV for each word under each topic. Effective integration strategies are designed to obtain high-quality contextual vectors. Extensive experiments on the SemEval-2013 and SemEval-2015 English all-words tasks show that TWE-WSD outperforms other state-of-the-art WSD methods, especially on nouns.
Online access: https://doi.org/10.1049/cit2.12006
collection DOAJ
language English
format Article
sources DOAJ
author Lianyin Jia
Jilin Tang
Mengjuan Li
Jinguo You
Jiaman Ding
Yinong Chen
author_sort Lianyin Jia
title TWE‐WSD: An effective topical word embedding based word sense disambiguation
publisher Wiley
series CAAI Transactions on Intelligence Technology
issn 2468-2322
publishDate 2021-03-01
description Abstract: Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it can represent the semantics of words well. However, most existing word embedding methods represent each word as a single vector, without considering the homonymy and polysemy of words; their performance is therefore limited. To address this problem, an effective topical word embedding (TWE)-based WSD method, named TWE-WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE-WSD generates a topical WV for each word under each topic. Effective integration strategies are designed to obtain high-quality contextual vectors. Extensive experiments on the SemEval-2013 and SemEval-2015 English all-words tasks show that TWE-WSD outperforms other state-of-the-art WSD methods, especially on nouns.
url https://doi.org/10.1049/cit2.12006
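The core idea in the abstract, one vector per (word, topic) pair instead of one vector per word, can be illustrated with a minimal sketch. This is not the authors' code: the tiny hand-made vectors, the averaging used to build the context vector, and the cosine-based topic selection below are all illustrative assumptions standing in for the trained TWE vectors and the integration strategies described in the paper.

```python
import math

# Toy topical embeddings: unlike standard word2vec, each (word, topic)
# pair gets its own vector, so an ambiguous word like "bank" has one
# vector per topic. All vectors below are hand-made assumptions.
TOPICAL_VECS = {
    ("bank", "finance"):    [0.9, 0.1, 0.0],
    ("bank", "geography"):  [0.0, 0.2, 0.9],
    ("money", "finance"):   [0.8, 0.2, 0.1],
    ("river", "geography"): [0.1, 0.1, 0.8],
}

def cosine(u, v):
    # Standard cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def context_vector(context_pairs):
    # One simple integration strategy: average the topical vectors of
    # the context words. The paper designs its own strategies; this is
    # just the simplest stand-in.
    vecs = [TOPICAL_VECS[p] for p in context_pairs if p in TOPICAL_VECS]
    dim = len(next(iter(TOPICAL_VECS.values())))
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def disambiguate(word, context_pairs):
    # Choose the topic whose (word, topic) vector is most similar
    # (by cosine) to the context vector.
    ctx = context_vector(context_pairs)
    candidates = [(t, v) for (w, t), v in TOPICAL_VECS.items() if w == word]
    return max(candidates, key=lambda tv: cosine(tv[1], ctx))[0]

print(disambiguate("bank", [("money", "finance")]))    # finance
print(disambiguate("bank", [("river", "geography")]))  # geography
```

With a financial context word, "bank" is pulled toward its finance-topic vector; with a river context, toward its geography-topic vector, which is the behavior the topical WVs are designed to enable.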