MEM-KGC: Masked Entity Model for Knowledge Graph Completion With Pre-Trained Language Model

The knowledge graph completion (KGC) task aims to predict missing links in knowledge graphs. Recently, several KGC models based on translational distance or semantic matching methods have been proposed and have achieved meaningful results. However, existing models have a significant shortcoming: they cannot learn an embedding for an entity that does not appear in the training phase. As a result, such models fall back on randomly initialized embeddings for entities unseen during training, which causes a critical drop in performance at test time. To solve this problem, we propose a new approach that performs the KGC task with the masked language model (MLM) objective used to pre-train language models. Given a triple (head entity, relation, tail entity), we mask the tail entity and treat the head entity and the relation as its context. The model then predicts the masked entity from among all entities, so the task proceeds exactly as in an MLM, which predicts a masked token from its surrounding context. Our experimental results show that the proposed model achieves significantly improved performance when unseen entities appear during the test phase and achieves state-of-the-art performance on the WN18RR dataset.
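The masking scheme the abstract describes maps naturally onto a BERT-style classifier. Below is a minimal sketch of that idea in Python, assuming the Hugging Face transformers library; the class name MaskedEntityModel, the textual input format, the linear layer over the entity vocabulary, and the example triple are illustrative assumptions, not the authors' released code.

```python
# Sketch of the masked-entity idea from the abstract (illustrative, not
# the paper's implementation): encode "head [SEP] relation [SEP] [MASK]"
# with BERT and classify the [MASK] position over all entities.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class MaskedEntityModel(nn.Module):  # hypothetical class name
    def __init__(self, num_entities: int, bert_name: str = "bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        # Score the masked position against every entity in the graph.
        self.entity_classifier = nn.Linear(self.bert.config.hidden_size, num_entities)

    def forward(self, input_ids, attention_mask, mask_positions):
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        # Gather the hidden state at each sequence's [MASK] token.
        batch_idx = torch.arange(input_ids.size(0))
        mask_hidden = hidden[batch_idx, mask_positions]
        return self.entity_classifier(mask_hidden)  # logits over all entities

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = MaskedEntityModel(num_entities=40943)  # e.g. the WN18RR entity count

# Triple (head, relation, ?tail): mask the tail and predict it.
text = "plant organism [SEP] hypernym [SEP] [MASK]"
enc = tokenizer(text, return_tensors="pt")
mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0]
logits = model(enc["input_ids"], enc["attention_mask"], mask_pos)
predicted_entity = logits.argmax(dim=-1)  # index of the predicted tail entity
```

Note that the fixed-size classifier shown here covers only the seen-entity case; handling unseen entities, as the abstract describes, would additionally require encoding textual entity descriptions, which this sketch omits.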


Bibliographic Details
Main Authors: Bonggeun Choi, Daesik Jang, Youngjoong Ko
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access, Vol. 9, pp. 132025-132032
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3113329
Subjects: Knowledge graph completion; link prediction; masked language model; pre-trained language model
Online Access: https://ieeexplore.ieee.org/document/9540703/
Author Details:
Bonggeun Choi (ORCID: https://orcid.org/0000-0001-5689-9789), Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea
Daesik Jang (ORCID: https://orcid.org/0000-0003-1978-8312), Department of Computer Science and Engineering, Sungkyunkwan University, Suwon, South Korea
Youngjoong Ko (ORCID: https://orcid.org/0000-0002-0241-9193), Department of Computer Science and Engineering, Sungkyunkwan University, Suwon, South Korea