Open Domain Chinese Triples Hierarchical Extraction Method
Open domain relation prediction is an important task in triples extraction. When faced with the task of constructing large-scale knowledge graph systems, with the exception of structured data, it is necessary to automatically extract triples from a large amount of unstructured text to expand entitie...
Main Authors: | Chunhui He, Zhen Tan, Haoran Wang, Chong Zhang, Yanli Hu, Bin Ge |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-07-01 |
Series: | Applied Sciences |
Subjects: | named entity recognition; open relation prediction; information extraction; CTHE |
Online Access: | https://www.mdpi.com/2076-3417/10/14/4819 |
id |
doaj-c28affcf5e5947979e56dd122df2ceac |
---|---|
record_format |
Article |
spelling |
doaj-c28affcf5e5947979e56dd122df2ceac; 2020-11-25T02:17:10Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-07-01; 10; 4819; 4819; 10.3390/app10144819; Open Domain Chinese Triples Hierarchical Extraction Method; Chunhui He (0), Zhen Tan (1), Haoran Wang (2), Chong Zhang (3), Yanli Hu (4), Bin Ge (5); affiliation of all six authors: Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China; Open domain relation prediction is an important task in triples extraction. When faced with the task of constructing large-scale knowledge graph systems, with the exception of structured data, it is necessary to automatically extract triples from a large amount of unstructured text to expand entities and relations. Although a large number of English open relation prediction methods have achieved good performance, the high-performance system for open domain Chinese triples extraction remains undeveloped due to the lack of large-scale Chinese annotation corpora and the difficulty of Chinese language processing. In this paper, we propose an integrated open domain Chinese triples hierarchical extraction method (CTHE) to solve this problem, considering the advantages of Bi-LSTM-CRF and Att-Bi-GRU models based on the pre-trained BERT encoding model. This method can recognize the named entities from Chinese sentences to establish entity pairs, and implement hierarchical extraction of specific and open relations based on the user-defined schema library and attention mechanism. The experimental results demonstrate the effectiveness of this method, which achieved stable performance on the test dataset, and better precision and F1-score in comparison with state-of-the-art Chinese open domain triples extraction methods. Furthermore, a large-scale annotated dataset for a Chinese named entity recognition (NER) task is established, which provides support for research on Chinese NER tasks.; https://www.mdpi.com/2076-3417/10/14/4819; named entity recognition; open relation prediction; information extraction; CTHE |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Chunhui He Zhen Tan Haoran Wang Chong Zhang Yanli Hu Bin Ge |
title |
Open Domain Chinese Triples Hierarchical Extraction Method |
publisher |
MDPI AG |
series |
Applied Sciences |
issn |
2076-3417 |
publishDate |
2020-07-01 |
description |
Open domain relation prediction is an important task in triples extraction. When faced with the task of constructing large-scale knowledge graph systems, with the exception of structured data, it is necessary to automatically extract triples from a large amount of unstructured text to expand entities and relations. Although a large number of English open relation prediction methods have achieved good performance, the high-performance system for open domain Chinese triples extraction remains undeveloped due to the lack of large-scale Chinese annotation corpora and the difficulty of Chinese language processing. In this paper, we propose an integrated open domain Chinese triples hierarchical extraction method (CTHE) to solve this problem, considering the advantages of Bi-LSTM-CRF and Att-Bi-GRU models based on the pre-trained BERT encoding model. This method can recognize the named entities from Chinese sentences to establish entity pairs, and implement hierarchical extraction of specific and open relations based on the user-defined schema library and attention mechanism. The experimental results demonstrate the effectiveness of this method, which achieved stable performance on the test dataset, and better precision and F1-score in comparison with state-of-the-art Chinese open domain triples extraction methods. Furthermore, a large-scale annotated dataset for a Chinese named entity recognition (NER) task is established, which provides support for research on Chinese NER tasks. |
topic |
named entity recognition open relation prediction information extraction CTHE |
url |
https://www.mdpi.com/2076-3417/10/14/4819 |