Summary: | Master's === National Tsing Hua University === Institute of Information Systems and Applications === 93 === We present an approach for disambiguating the word senses of an adjective in a given sentence based on collocates and semantic relationships in WordNet. In our approach, we use bootstrapping to learn a list of collocates for each word sense of the adjective from a small number of sense-tagged samples and a very large untagged corpus.
The method involves extracting collocates of the adjective, both sense-labeled and unlabeled, from the training data and the untagged corpus; assigning labels to the unlabeled collocates by measuring WordNet-based similarities between labeled and unlabeled collocates; and building a WSD model from the labeled collocates. At runtime, collocates of the adjective are identified and compared with the labeled collocates. The adjective is then disambiguated according to the sense labels of the three most similar collocates.
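The two steps above (label propagation and the runtime decision) can be illustrated with a minimal sketch. It is not the implementation evaluated in this work. The function names `propagate_labels` and `disambiguate`, the similarity threshold, the majority vote over the three nearest collocates, and the hand-written `toy_sim` table standing in for a WordNet-based similarity measure are all assumptions made for illustration. The example adjective senses and collocates are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional

Similarity = Callable[[str, str], float]


def propagate_labels(labeled: Dict[str, str], unlabeled: Iterable[str],
                     sim: Similarity, threshold: float) -> Dict[str, str]:
    """Give each unlabeled collocate the sense of its most similar labeled one.

    The threshold is an assumed cut-off; collocates below it stay unlabeled.
    """
    expanded = dict(labeled)
    if not labeled:
        return expanded
    for word in unlabeled:
        best = max(labeled, key=lambda known: sim(word, known))
        if sim(word, best) >= threshold:
            expanded[word] = labeled[best]
    return expanded


def disambiguate(context_collocates: Iterable[str], labeled: Dict[str, str],
                 sim: Similarity, k: int = 3) -> Optional[str]:
    """Pick a sense from the k labeled collocates most similar to the context.

    Majority voting is an assumption; the text only says the decision uses
    the sense labels of the three most similar collocates.
    """
    scored = [(sim(c, known), sense)
              for c in context_collocates
              for known, sense in labeled.items()]
    if not scored:
        return None
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    votes = Counter(sense for _, sense in top)
    return votes.most_common(1)[0][0]


# Hypothetical similarity table; a real system would query WordNet instead.
TOY_SIM = {
    frozenset(("wind", "breeze")): 0.9,
    frozenset(("wind", "water")): 0.4,
    frozenset(("temper", "head")): 0.7,
}


def toy_sim(a: str, b: str) -> float:
    return 1.0 if a == b else TOY_SIM.get(frozenset((a, b)), 0.0)


# Hypothetical seed collocates for two senses of one adjective.
seed = {"water": "temperature", "breeze": "temperature", "head": "calm"}
expanded = propagate_labels(seed, ["wind", "temper", "idea"], toy_sim, 0.5)
sense_for_temper = disambiguate(["temper"], expanded, toy_sim)
sense_for_wind = disambiguate(["wind"], expanded, toy_sim)
```

In this toy run, "wind" and "temper" inherit labels from their nearest seed collocates, while "idea" stays unlabeled because no seed collocate is similar enough under the assumed threshold.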
We experimented with an implementation of the proposed method using SemCor, the Senseval-2 lexical sample training set, and the British National Corpus (BNC). Evaluation on collocates of six adjectives selected from Senseval-2 shows that the WordNet-based bootstrapping approach performs better than previous work on word sense disambiguation (WSD) of adjectives. It is therefore reasonable to conclude that the accuracy of word sense disambiguation for adjectives can be improved by computing WordNet-based similarities among the collocates of the adjectives.
|