英中詞彙知識庫建構機制之研究 (A Study of Mechanisms for Constructing an English-Chinese Lexical Knowledge Base)

Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 88 === This paper proposes a sense tagger for Mandarin Chinese that uses contextual information and the mapping from WordNet synsets to Cilin sense tags to perform word sense disambiguation. The performance for tagging low(2-4), middle(5-8) and high(>8) ambiguous wor...

Bibliographic Details
Main Authors: Chi-Ching Lin, 林其青
Other Authors: Hsin-Hsi Chen
Format: Others
Language: zh-TW
Published: 2000
Online Access: http://ndltd.ncl.edu.tw/handle/34887029646704875197
id ndltd-TW-088NTU00392021
record_format oai_dc
spelling ndltd-TW-088NTU00392021 2016-01-29T04:18:37Z http://ndltd.ncl.edu.tw/handle/34887029646704875197 英中詞彙知識庫建構機制之研究 Chi-Ching Lin 林其青 Master's National Taiwan University Graduate Institute of Computer Science and Information Engineering 88 Hsin-Hsi Chen 陳信希 2000 thesis 71 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 88 === This paper proposes a sense tagger for Mandarin Chinese that uses contextual information and the mapping from WordNet synsets to Cilin sense tags to perform word sense disambiguation. Tagging performance for words of low (2-4), middle (5-8), and high (>8) ambiguity averages 63.36% when the small categories (1,428 senses) are used and 1-3 candidates are proposed, respectively. The performance of tagging unknown words is 34.35%, which is better than that of the baseline model. This sense tagger helps us build a large-scale sense-tagged corpus from ASBC. This paper also proposes a method for constructing a Chinese-English WordNet automatically: Chinese words are mapped to WordNet synsets according to their word senses. In addition to building the mapping between Chinese Cilin senses and English WordNet synsets, we also set up a Chinese lexical knowledge base. The results are applied to Chinese-English information retrieval. When the Chinese-English WordNet is applied in our CLIR experiment, it achieves 69.7% of monolingual IR effectiveness.
author2 Hsin-Hsi Chen
author_facet Hsin-Hsi Chen
Chi-Ching Lin
林其青
author Chi-Ching Lin
林其青
spellingShingle Chi-Ching Lin
林其青
英中詞彙知識庫建構機制之研究
author_sort Chi-Ching Lin
title 英中詞彙知識庫建構機制之研究
title_short 英中詞彙知識庫建構機制之研究
title_full 英中詞彙知識庫建構機制之研究
title_fullStr 英中詞彙知識庫建構機制之研究
title_full_unstemmed 英中詞彙知識庫建構機制之研究
title_sort 英中詞彙知識庫建構機制之研究
publishDate 2000
url http://ndltd.ncl.edu.tw/handle/34887029646704875197
work_keys_str_mv AT chichinglin yīngzhōngcíhuìzhīshíkùjiàngòujīzhìzhīyánjiū
AT línqíqīng yīngzhōngcíhuìzhīshíkùjiàngòujīzhìzhīyánjiū
_version_ 1718167354533216256