A Chinese Textual Entailment System Developed for a Specialized Dataset on Linguistic Phenomena

Bibliographic Details
Main Authors: Liu, Chi-Ting, 劉耆定
Other Authors: Lin, Chuan-Jie
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/89783556627047357483
Description
Summary: Master's === National Taiwan Ocean University === Department of Computer Science and Engineering === 104 === This thesis proposes a textual entailment classification system built on a dataset that focuses on individual entailment-related linguistic phenomena. Each text pair in this dataset is annotated with its entailment relationship label and the related linguistic phenomenon. Given a sentence pair, the system first performs the necessary linguistic preprocessing, then aligns identical and synonymous terms to find the differences between the two sentences. Several resources are used to define synonyms; among them, Wikipedia provides the most useful synonym sets. Two kinds of textual entailment classification systems are proposed: rule-based and ML-based. The rule-based system consists of several classification modules, each handling differences in quantity, temporal, spatial, hypernymy, antonymy, negation, or syntactic information. The final decision is made by selecting among the module results in the order of contradiction, independence, forward entailment, and bidirectional entailment. The experimental results show that rules designed according to the linguistic phenomena can improve the performance of textual entailment classification. Our hybrid system achieves a macro-averaged F1-measure of 48.61% and an accuracy of 49.42%, outperforming the best systems in the NTCIR-11 RITE-VAL System Validation Chinese subtasks.
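
The decision order described in the summary can be sketched as follows. This is a minimal illustration, not the implementation from the thesis: the function name `combine_module_outputs`, the label strings, and the convention that a module returns `None` when it has nothing to report are all assumptions made for the example. The summary does not say what happens when no module produces a result, so the sketch simply returns `None` in that case.

```python
from typing import Iterable, Optional

# Priority order taken from the summary: contradiction is checked first,
# bidirectional entailment last. The label strings are illustrative.
PRIORITY = ("contradiction", "independence", "forward", "bidirectional")


def combine_module_outputs(outputs: Iterable[Optional[str]]) -> Optional[str]:
    """Return the highest-priority label proposed by any module.

    Each element of `outputs` is the label proposed by one (hypothetical)
    classification module, or None if that module proposed nothing.
    Returns None when no module proposed a label.
    """
    proposed = {label for label in outputs if label is not None}

    # Reject labels outside the four known classes instead of ignoring them.
    unknown = proposed - set(PRIORITY)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    # Walk the priority list and return the first label any module proposed.
    for label in PRIORITY:
        if label in proposed:
            return label
    return None


# Hypothetical outputs from four modules for one sentence pair.
votes = [None, "forward", "contradiction", None]
print(combine_module_outputs(votes))
```

With these hypothetical votes, the contradiction result wins over the forward-entailment result because it comes earlier in the priority order.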