LSTM-CRF Neural Network With Gated Self Attention for Chinese NER

Bibliographic Details
Main Authors: Yanliang Jin, Jinfei Xie, Weisi Guo, Can Luo, Dijia Wu, Rui Wang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8844740/
Summary: Named entity recognition (NER) is an essential natural language processing task. Chinese NER differs from NER in many European languages due to the lack of natural word delimiters, so Chinese Word Segmentation (CWS) is usually regarded as the first step of a Chinese NER pipeline. However, word-based NER models that rely on CWS are vulnerable to incorrectly segmented entity boundaries and to out-of-vocabulary (OOV) words. In this paper, we propose a novel character-based Gated Convolutional Recurrent neural network with Attention (GCRA) for the Chinese NER task. In particular, we introduce a hybrid convolutional neural network with a gating filter mechanism to capture local context information, and a highway network after the LSTM to select characters of interest. An additional gated self-attention mechanism captures global dependencies from multiple different subspaces and arbitrary adjacent characters. We evaluate the proposed model on three datasets: the SIGHAN Bakeoff 2006 MSRA, Chinese Resume, and Literature NER datasets. The experimental results show that our model outperforms other state-of-the-art models without relying on any external resources such as lexicons or multi-task joint training.
ISSN:2169-3536
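
For readers who want a concrete picture of the pipeline the abstract describes, below is a minimal PyTorch sketch: a gated (GLU-style) convolution for local context, a BiLSTM followed by a highway layer, a gated multi-head self-attention block, and a per-character tag emission layer. The class name GCRASketch, all layer sizes, and the exact gating formulas are illustrative assumptions; the paper's actual configuration (including the CRF decoding layer named in the title, omitted here for brevity) is not reproduced.

```python
# Minimal sketch of the GCRA-style architecture described in the abstract.
# All hyperparameters, module names, and gating formulas are assumptions;
# a CRF layer would normally decode the emission scores produced here.
import torch
import torch.nn as nn

class GCRASketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128, num_tags=10, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Gated convolution (GLU-style): one branch is sigmoid-gated to
        # filter local context features around each character.
        self.conv = nn.Conv1d(emb_dim, 2 * emb_dim, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(emb_dim, hidden // 2,
                            batch_first=True, bidirectional=True)
        # Highway layer after the LSTM to select characters of interest.
        self.hw_transform = nn.Linear(hidden, hidden)
        self.hw_gate = nn.Linear(hidden, hidden)
        # Multi-head self-attention for global dependencies, with a learned
        # gate that mixes the attention output back into its input.
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.attn_gate = nn.Linear(2 * hidden, hidden)
        # Per-character emission scores over the tag set.
        self.emit = nn.Linear(hidden, num_tags)

    def forward(self, char_ids):                      # (batch, seq)
        x = self.embed(char_ids)                      # (batch, seq, emb)
        a, b = self.conv(x.transpose(1, 2)).chunk(2, dim=1)
        x = (a * torch.sigmoid(b)).transpose(1, 2)    # gated local features
        h, _ = self.lstm(x)                           # (batch, seq, hidden)
        t = torch.sigmoid(self.hw_gate(h))            # highway carry gate
        h = t * torch.relu(self.hw_transform(h)) + (1 - t) * h
        s, _ = self.attn(h, h, h)                     # global self-attention
        g = torch.sigmoid(self.attn_gate(torch.cat([h, s], dim=-1)))
        h = g * s + (1 - g) * h                       # gated attention output
        return self.emit(h)                           # (batch, seq, num_tags)

emissions = GCRASketch(vocab_size=5000)(torch.randint(0, 5000, (2, 20)))
print(emissions.shape)  # torch.Size([2, 20, 10])
```

Because the model is character-based, it needs no word segmenter at inference time; the gated convolution stands in for the local-context cues that CWS would otherwise supply, which is how the abstract's OOV and boundary-error argument is reflected in the design.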