TWE‐WSD: An effective topical word embedding based word sense disambiguation
Abstract: Word embedding has been widely used in word sense disambiguation (WSD) and many other tasks in recent years because it can represent the semantics of words well. However, existing word embedding methods mostly represent each word as a single vector, without considering the homonymy and polysemy of the word; thus, their performance is limited. To address this problem, an effective topical word embedding (TWE)-based WSD method, named TWE-WSD, is proposed, which integrates Latent Dirichlet Allocation (LDA) and word embedding. Instead of generating a single word vector (WV) for each word, TWE-WSD generates a topical WV for each word under each topic. Effective integrating strategies are designed to obtain high-quality contextual vectors. Extensive experiments on the SemEval-2013 and SemEval-2015 English all-words tasks showed that TWE-WSD outperforms other state-of-the-art WSD methods, especially on nouns.
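The abstract's core idea is that a polysemous word gets one vector per LDA topic rather than a single vector, and a contextual vector built from the surrounding words selects the topic-specific sense. The sketch below is a toy illustration of that idea only, not the paper's implementation: the two-dimensional topical vectors, topic labels, and the simple "average the context words" integrating strategy are all invented here for demonstration (the paper trains the vectors with LDA plus a word-embedding model and evaluates several integrating strategies).

```python
# Toy sketch of topical word embeddings for WSD: each (word, topic)
# pair has its own vector, and the sense of an ambiguous word is the
# topic whose vector is closest to the averaged context vector.
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hand-made topical vectors: the polysemous word "bank" has a
# different vector under the "finance" topic than under "river".
topical_wv = {
    ("bank", "finance"):  [0.9, 0.1],
    ("bank", "river"):    [0.1, 0.9],
    ("money", "finance"): [0.8, 0.2],
    ("water", "river"):   [0.2, 0.8],
}

def context_vector(context_words, topic):
    """One simple integrating strategy: average the topical vectors
    of the context words that are known under this topic."""
    vecs = [topical_wv[(w, topic)] for w in context_words
            if (w, topic) in topical_wv]
    dim = len(next(iter(topical_wv.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def disambiguate(word, context_words, topic):
    """Return the topic (sense) of `word` whose topical vector is
    most similar to the contextual vector."""
    ctx = context_vector(context_words, topic)
    senses = {t: v for (w, t), v in topical_wv.items() if w == word}
    return max(senses, key=lambda t: cosine(senses[t], ctx))

print(disambiguate("bank", ["money"], "finance"))  # -> finance
print(disambiguate("bank", ["water"], "river"))    # -> river
```

With context word "money" the finance-topic vector of "bank" wins; with "water" the river-topic vector wins, which is the disambiguation effect a single per-word vector cannot produce.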
Main Authors: Lianyin Jia, Jilin Tang, Mengjuan Li, Jinguo You, Jiaman Ding, Yinong Chen
Format: Article
Language: English
Published: Wiley, 2021-03-01
Series: CAAI Transactions on Intelligence Technology
ISSN: 2468-2322
Collection: DOAJ
Online Access: https://doi.org/10.1049/cit2.12006
Author Affiliations:
Lianyin Jia, Jilin Tang, Jinguo You, Jiaman Ding: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, China
Mengjuan Li: Library, Yunnan Normal University, Kunming, China
Yinong Chen: School of Computing, Informatics, and Decision Systems, Arizona State University, Tempe, Arizona, USA