Entity perception of Two-Step-Matching framework for public opinions

Bibliographic Details
Main Authors: Ren-De Li, Hao-Tian Ma, Zi-Yi Wang, Qiang Guo, Jian-Guo Liu
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2020-09-01
Series: Journal of Safety Science and Resilience
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2666449620300050
Description
Summary: Entity perception in ambiguous user comments is a critical target-identification problem when handling large volumes of public opinion. This paper proposes a Two-Step-Matching method that identifies the precise target entity among the multiple entities mentioned in a comment. First, potential entities are extracted from public comments with a BiLSTM-CRF model, and characteristic words with a TF-IDF model. Second, the first matching is performed between the potential entities and an official business directory using the Jaro–Winkler distance algorithm. Then, to find the precise entity, an industry-characteristic dictionary is introduced in the second matching process: the precise entity is identified according to the number of characteristic words that match the industry-characteristic dictionary. In addition, an associated rate (a global indicator) and an accuracy rate (a sample indicator) are defined to evaluate matching accuracy. Results on three data sets of public opinions about major public health events show that the highest associated rate and accuracy rate reach 0.93 and 0.95, on average 32% and 30% higher, respectively, than when the first matching process is used alone. The framework provides a method for finding the true target entity that a public opinion is actually about.
ISSN: 2666-4496
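
The two matching steps described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.85 similarity threshold, and the layout of the industry-characteristic dictionary (one set of words per directory entry) are all assumptions made for illustration. Entity extraction (BiLSTM-CRF) and characteristic-word extraction (TF-IDF) are taken as already done, and the Jaro–Winkler similarity is written out in plain Python so the block has no dependencies.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted by a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def first_match(candidate, directory, threshold=0.85):
    """Step 1: shortlist directory names similar to the extracted entity.

    The threshold is a hypothetical value, not one taken from the paper.
    """
    return [name for name in directory
            if jaro_winkler(candidate, name) >= threshold]


def second_match(shortlist, characteristic_words, industry_dict):
    """Step 2: pick the shortlisted name whose industry words overlap most
    with the comment's characteristic words (assumed dictionary layout)."""
    if not shortlist:
        return None
    words = set(characteristic_words)
    return max(shortlist,
               key=lambda name: len(words & industry_dict.get(name, set())))
```

In this sketch, a comment mentioning an abbreviated company name would first be shortlisted against every similar directory name, and the characteristic words of the comment would then break the tie in favour of the entry whose industry vocabulary they match most often.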