Chinese Anaphora Resolution Based on Weight Learning and Knowledge Acquisition

Ph.D. === National Chiao Tung University === Institute of Computer Science and Engineering === 99 === Anaphora is a commonly observed linguistic phenomenon used to avoid repeating expressions in discourse. Anaphora resolution is the process of identifying the antecedent of an anaphor in its context. Effective anaphora resolution plays an essential...

Bibliographic Details
Main Authors:	Wu, Dian-Song, 吳典松
Other Authors:	Liang, Tyne
Format:	Others
Language:	en_US
Published:	2010
Online Access:	http://ndltd.ncl.edu.tw/handle/17517841566588913082

id	ndltd-TW-099NCTU5394050
record_format	oai_dc
spelling	ndltd-TW-099NCTU5394050 2016-04-08T04:22:00Z http://ndltd.ncl.edu.tw/handle/17517841566588913082 Chinese Anaphora Resolution Based on Weight Learning and Knowledge Acquisition 以權重學習與知識擷取為基礎之中文指代消解研究 Wu, Dian-Song 吳典松 Ph.D., National Chiao Tung University, Institute of Computer Science and Engineering 99 Liang, Tyne 梁婷 2010 thesis 78 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Ph.D. === National Chiao Tung University === Institute of Computer Science and Engineering === 99 === Anaphora is a commonly observed linguistic phenomenon used to avoid repeating expressions in discourse. Anaphora resolution is the process of identifying the antecedent of an anaphor in its context. Effective anaphora resolution plays an essential role in many natural language processing applications, such as machine translation, summarization, and information extraction. Earlier anaphora resolution methods relied on syntactic rules and on semantic or pragmatic clues to identify the antecedent. More recently, attention has shifted to statistical and classification-based approaches. However, a rule-based approach usually selects the antecedent by a salience score whose feature weights are assigned manually, so errors can arise from intuitive observations and subjective biases in choosing the weights. A classification-based approach, on the other hand, considers the candidates for the same anaphor independently, so it cannot effectively capture the preference relationships between competing candidates during resolution. To overcome these problems, we propose Chinese anaphora resolution methods based on weight learning and knowledge acquisition. This thesis addresses pronominal, zero, and definite anaphora in Chinese texts and presents a different approach for each. We resolve Chinese pronominal anaphora with lexical knowledge acquisition and salience measurement. Lexical knowledge acquisition aims to extract additional semantic features, such as gender, number, and collocate compatibility. The salience measurement uses entropy-based weighting to select among antecedent candidates. Experimental results show that the proposed approach achieves an 82.5% success rate on 1343 anaphoric instances, a 7% improvement over the general rule-based approach.
For Chinese zero anaphora, we apply case-based reasoning and pattern conceptualization to overcome the difficulty of constructing proper reasoning mechanisms and the insufficiency of lexical features. Experimental results show that the proposed approach achieves a competitive 79% F-score on 1051 anaphoric instances, a 13% improvement over the general rule-based approach. We use two strategies to resolve Chinese definite anaphors. The first is an adaptive-weight salience measurement that evaluates the entire set of candidates simultaneously. The second is a Web-based knowledge acquisition model that enables semantic compatibility extraction and the use of multiple resources. Experimental results show that the proposed approach achieves a 72.5% success rate on 426 anaphoric instances, a 4.7% improvement over the result obtained with a conventional classifier.
author2	Liang, Tyne
author_facet	Liang, Tyne Wu, Dian-Song 吳典松
author	Wu, Dian-Song 吳典松
author_sort	Wu, Dian-Song
title	Chinese Anaphora Resolution Based on Weight Learning and Knowledge Acquisition
publishDate	2010
url	http://ndltd.ncl.edu.tw/handle/17517841566588913082
