A Chinese Textual Entailment System Developed for a Specialized Dataset on Linguistic Phenomena

Master's thesis (碩士) === National Taiwan Ocean University (國立臺灣海洋大學) === Department of Computer Science and Engineering (資訊工程學系) === academic year 104 === This thesis proposes a textual entailment classification system built on a dataset that focuses on individual entailment-related linguistic phenomena. For each text pair in this dataset, its entailment relationship label and related linguistic phenomeno...

Bibliographic Details
Main Authors: Liu, Chi-Ting, 劉耆定
Other Authors: Lin, Chuan-Jie
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/89783556627047357483
id ndltd-TW-104NTOU5394018
record_format oai_dc
spelling ndltd-TW-104NTOU5394018 2017-09-24T04:40:47Z http://ndltd.ncl.edu.tw/handle/89783556627047357483 A Chinese Textual Entailment System Developed for a Specialized Dataset on Linguistic Phenomena 針對語言現象特例化資料集設計之中文文本蘊涵系統 Liu, Chi-Ting 劉耆定 Master's thesis (碩士) National Taiwan Ocean University (國立臺灣海洋大學) Department of Computer Science and Engineering (資訊工程學系) 104 Lin, Chuan-Jie 林川傑 2016 學位論文 ; thesis 61 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's thesis (碩士) === National Taiwan Ocean University (國立臺灣海洋大學) === Department of Computer Science and Engineering (資訊工程學系) === academic year 104 === This thesis proposes a textual entailment classification system built on a dataset that focuses on individual entailment-related linguistic phenomena. For each text pair in this dataset, the entailment relationship label and the related linguistic phenomenon are provided. Given a sentence pair, the necessary linguistic preprocessing is performed. Identical and synonymous terms in the two sentences are aligned in order to find the differences between them. Several resources are used to define synonyms; among them, Wikipedia provides the most useful synonym sets. Two kinds of textual entailment classification systems are proposed: rule-based and ML-based. The rule-based system consists of several classification modules, each targeting differences in quantity, temporal, spatial, hypernymy, antonym, negation, or syntactic information. The final decision is made by selecting among the results of these modules in the order of contradiction, independence, forward entailment, and bidirectional entailment. The experimental results show that rules designed around the linguistic phenomena improve the performance of textual entailment classification. Our hybrid system achieves a macro-averaged F1-measure of 48.61% and an accuracy of 49.42%, which outperforms the best systems in the NTCIR-11 RITE-VAL System Validation Chinese subtasks.
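The decision order named in the abstract (contradiction, then independence, then forward entailment, then bidirectional entailment) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the module names, the label strings, the combine function, and the fallback label used when no module proposes anything are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Priority order taken from the abstract; the label strings are hypothetical.
PRIORITY = ["contradiction", "independence", "forward", "bidirectional"]


def combine(module_votes: Dict[str, Optional[str]],
            default: str = "bidirectional") -> str:
    """Return the highest-priority label proposed by any module.

    module_votes maps a (hypothetical) module name to the label it
    proposes, or None when the module has no opinion. The default label
    returned when no module proposes anything is an assumption.
    """
    proposed = {label for label in module_votes.values() if label is not None}
    for label in PRIORITY:
        if label in proposed:
            return label
    return default


# Example: a negation module reports a contradiction while a hypernymy
# module reports forward entailment; the contradiction takes precedence.
votes = {"negation": "contradiction", "hypernymy": "forward", "quantity": None}
final_label = combine(votes)
```

Under this ordering, a single module reporting a contradiction overrides any entailment reported by the others, which is one plausible reading of "selecting the results from these modules in the order of" the four labels.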