QUANTIFYING SEMANTIC SHIFT VISUALLY ON A MALAY DOMAIN SPECIFIC CORPUS USING TEMPORAL WORD EMBEDDING APPROACH
In this study, we propose an alternative approach to analyzing a domain-specific time-series corpus to detect word evolution. The method trains a temporal word embedding (TWE) model on a target time-series corpus. The advantage of TWE is that one can see how the meaning of a word changes ov...
Main Authors: | Sabrina Tiun, Saidah Saad, Nor Fariza Mohd Noor, Azhar Jalaludin, Anis Nadiah Che Abdul Rahman |
---|---|
Format: | Article |
Language: | English |
Published: | UKM Press, 2020-12-01 |
Series: | Asia-Pacific Journal of Information Technology and Multimedia |
Subjects: | temporal word embedding, temporal corpus, Malaysian Hansard Corpus |
Online Access: | https://www.ukm.my/apjitm/view.php?id=197 |
id | doaj-95661df5a7b04ae18e9dc169237fbe72 |
---|---|
record_format | Article |
spelling | doaj-95661df5a7b04ae18e9dc169237fbe72; 2021-06-30T06:25:12Z; eng; UKM Press; Asia-Pacific Journal of Information Technology and Multimedia; 2289-2192; 2020-12-01; 0902110; https://doi.org/10.17576/apjitm-2020-0902-01 |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Sabrina Tiun; Saidah Saad; Nor Fariza Mohd Noor; Azhar Jalaludin; Anis Nadiah Che Abdul Rahman |
title | QUANTIFYING SEMANTIC SHIFT VISUALLY ON A MALAY DOMAIN SPECIFIC CORPUS USING TEMPORAL WORD EMBEDDING APPROACH |
publisher | UKM Press |
series | Asia-Pacific Journal of Information Technology and Multimedia |
issn | 2289-2192 |
publishDate | 2020-12-01 |
description | In this study, we propose an alternative approach to analyzing a domain-specific time-series corpus to detect word evolution. The method trains a temporal word embedding (TWE) model on a target time-series corpus. The advantage of TWE is that one can see how the meaning of a word changes over time. We chose the TWEC approach to model a Malay domain-specific time-series corpus, the Malaysian Hansard Corpus (MHC), as a TWE model, which we call MHC-TWEC. Two primary analyses, self-similarity analysis and user-defined method analysis, were performed to validate the effectiveness of the MHC-TWEC model in visually quantifying semantic shift in the MHC. These analyses show visually that the TWE model can capture semantic shift in the temporal corpus (the MHC). |
topic | temporal word embedding; temporal corpus; malaysian hansard corpus |
url | https://www.ukm.my/apjitm/view.php?id=197 |