Design of a Chinese Opinion Mining System

Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 100 === Chinese grammatical structure differs from English, and Chinese text has no spaces between words, so using a POS tagger or parser to search for opinion words can easily lead to errors. Therefore, when capturing opinion words by using the thesaurus (lexi...

Bibliographic Details
Main Authors:	Lee Chien, 簡立
Other Authors:	Rui-Dong Chiang
Format:	Others
Language:	zh-TW
Published:	2012
Online Access:	http://ndltd.ncl.edu.tw/handle/99856694618811977409

id	ndltd-TW-100TKU05392077
record_format	oai_dc
spelling	ndltd-TW-100TKU05392077 2015-10-13T21:27:35Z http://ndltd.ncl.edu.tw/handle/99856694618811977409 Design of a Chinese Opinion Mining System 中文意見探勘系統設計 Lee Chien 簡立 Master's, Tamkang University, Master's Program, Department of Computer Science and Information Engineering, 100 Rui-Dong Chiang 蔣璿東 2012 學位論文 ; thesis 71 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === Tamkang University === Master's Program, Department of Computer Science and Information Engineering === 100 === Chinese grammatical structure differs from English, and Chinese text has no spaces between words, so using a POS tagger or parser to search for opinion words can easily lead to errors. Therefore, when capturing opinion words with a thesaurus (lexicon), this study uses a proposed exclusion word method to improve the precision of opinion word capture. Because each field has its own terminology and idioms (opinion words and exclusion words), ordinary dictionaries can hardly cover all the opinion words of a specific field. For a specific field, however, as long as the training data are sufficient, most of the opinion words and exclusion words outside the dictionaries can be captured. The opinion words and exclusion words that are outside the dictionaries and not included in the training set are few, their number remains stable, and they are often words that are not frequently used. The experimental data come from two different but similar fields of Mobile01 telecommunications. Because the thesaurus/lexicon approach is used, all the opinion words and exclusion words in the dictionaries can be captured, whereas those outside the dictionaries can be determined only by manual tagging, which is time-consuming and labor-intensive. Therefore, based on the stability of the new opinion words and exclusion words outside the dictionaries, this study attempts to design a two-stage lexicon training method to solve this problem. In the first stage, the opinion words or exclusion words of the training data are captured by semi-automated manual tagging. In the second stage, once the system is online, the dictionaries are used directly to capture the opinion words or exclusion words of the articles, and the accuracy of the captured words is then inspected manually. According to the experimental data, the second-stage training procedure can save a great deal of manual tagging time.
author2	Rui-Dong Chiang
author_facet	Rui-Dong Chiang Lee Chien 簡立
author	Lee Chien 簡立
spellingShingle	Lee Chien 簡立 Design of a Chinese Opinion Mining System
author_sort	Lee Chien
title	Design of a Chinese Opinion Mining System
title_short	Design of a Chinese Opinion Mining System
title_full	Design of a Chinese Opinion Mining System
title_fullStr	Design of a Chinese Opinion Mining System
title_full_unstemmed	Design of a Chinese Opinion Mining System
title_sort	design of a chinese opinion mining system
publishDate	2012
url	http://ndltd.ncl.edu.tw/handle/99856694618811977409
work_keys_str_mv	AT leechien designofachineseopinionminingsystem AT jiǎnlì designofachineseopinionminingsystem AT leechien zhōngwényìjiàntànkānxìtǒngshèjì AT jiǎnlì zhōngwényìjiàntànkānxìtǒngshèjì
_version_	1718064115010764800
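
The abstract does not define how exclusion words are applied, so the following Python sketch is only one plausible reading, not the thesis's implementation. It assumes that an exclusion word is a longer string containing an opinion word in a non-opinion sense, and that any opinion-word match overlapping an exclusion-word match is discarded. The function name, the sample sentence, and both word lists are hypothetical.

```python
def capture_opinion_words(text, opinion_words, exclusion_words):
    """Return sorted (start, word) pairs for opinion words found in text,
    skipping any match that overlaps an exclusion-word match.

    Assumption (not from the thesis): exclusion words block by overlap.
    """
    # Collect every character position covered by an exclusion word.
    blocked = set()
    for ex in exclusion_words:
        if not ex:
            continue  # ignore empty entries
        start = text.find(ex)
        while start != -1:
            blocked.update(range(start, start + len(ex)))
            start = text.find(ex, start + 1)

    # Plain substring lookup: no word segmentation, POS tagging, or parsing.
    hits = []
    for word in opinion_words:
        if not word:
            continue
        start = text.find(word)
        while start != -1:
            span = range(start, start + len(word))
            if not any(i in blocked for i in span):
                hits.append((start, word))
            start = text.find(word, start + 1)
    return sorted(hits)


# Hypothetical example: "不錯" ("not bad") is an opinion word, but inside
# "不錯過" ("do not miss") it is not an opinion.
sentence = "這支手機不錯，但千萬不錯過特價"
print(capture_opinion_words(sentence, {"不錯"}, {"不錯過"}))
# [(4, '不錯')]
```

Without the exclusion list, the same call would also report the match at position 10, which is the kind of false capture that unsegmented lexicon lookup can produce.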
