Chinese Sentiment Word Acquisition and Its Applications to Opinion Extraction

Master's thesis === National Taiwan University (國立臺灣大學) === Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) === 93 === Opinions may be explicitly or implicitly embedded in documents. They provide information and viewpoints that are useful for improving government services or company products. We consider an opinion to be a statement that is expressed towards a topic and contains sentiments....

Bibliographic Details
Main Authors: Tung-Ho Wu, 吳東和
Other Authors: 陳信希
Format: Others
Language: en_US
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/64039269154360333652
id ndltd-TW-093NTU05392080
record_format oai_dc
spelling ndltd-TW-093NTU05392080 2015-12-21T04:04:14Z http://ndltd.ncl.edu.tw/handle/64039269154360333652 Chinese Sentiment Word Acquisition and Its Applications to Opinion Extraction 中文情緒詞彙自動學習及在意見擷取之應用 Tung-Ho Wu 吳東和 碩士 國立臺灣大學 資訊工程學研究所 93 陳信希 2005 學位論文 ; thesis 48 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Taiwan University (國立臺灣大學) === Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) === 93 === Opinions may be explicitly or implicitly embedded in documents. They provide information and viewpoints that are useful for improving government services or company products. We consider an opinion to be a statement that is expressed towards a topic and contains sentiments. Sentiment words determine the opinion type of an opinion passage and the overall opinion tendency of a document, so they are the key features in opinion extraction. We propose three approaches, the Thesaurus-Based Approach, the Character-Based Approach and the Combined Approach, to determine whether an unknown word is positive, negative or non-sentiment. The Thesaurus-Based Approach uses synonym information to classify an unknown word. The Character-Based Approach computes the sentiment score of a Chinese word from its component characters and classifies the word by that score. The Combined Approach uses both synonym information and sentiment scores to classify an unknown word, and it performs best of the three. Under strict human assessment, the F-measure is 73.18% for verbs and 63.75% for nouns, and the average F-measure is 70.40%. Finally, we propose the Sentiment Miner, based on the Combined Approach, to acquire new positive and negative sentiment words from documents. For opinion extraction, we propose the Passage Level Algorithm to detect the opinion passages inside a document. This algorithm uses sentiment words and context information. We also propose the Document Level Algorithm to determine the overall opinion tendency of a document from the opinion passages it contains. In experiments, the best F-measure is 62.16% at the passage level and 76.56% at the document level.
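The Character-Based Approach in the description above can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption: the normalized frequency difference used for character scores, the averaging over characters, the threshold value, the tiny seed lists, and the function names `char_scores`, `word_score` and `classify` are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def char_scores(pos_words, neg_words):
    """Score each character in [-1, 1] from seed word lists (assumed formula)."""
    pos = Counter(ch for w in pos_words for ch in w)
    neg = Counter(ch for w in neg_words for ch in w)
    n_pos = sum(pos.values()) or 1
    n_neg = sum(neg.values()) or 1
    scores = {}
    for ch in set(pos) | set(neg):
        p = pos[ch] / n_pos  # relative frequency among positive seeds
        n = neg[ch] / n_neg  # relative frequency among negative seeds
        scores[ch] = (p - n) / (p + n)
    return scores


def word_score(word, scores):
    """Average the scores of the characters that were seen in the seeds."""
    known = [scores[ch] for ch in word if ch in scores]
    return sum(known) / len(known) if known else 0.0


def classify(word, scores, threshold=0.3):
    """Map a word score to one of the three classes (threshold is assumed)."""
    s = word_score(word, scores)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "non-sentiment"


# Hypothetical seed words, for illustration only.
SCORES = char_scores(["快樂", "歡樂", "喜歡"], ["悲傷", "傷心", "痛苦"])
print(classify("歡喜", SCORES), classify("悲痛", SCORES), classify("電腦", SCORES))
```

A word whose characters never occur in the seed lists gets a score of 0 and falls into the non-sentiment class. In a combined scheme of the kind the description mentions, synonym information could be consulted for such words, but how the thesis actually merges the two signals is not stated in this record.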
author2 陳信希
author_facet 陳信希
Tung-Ho Wu
吳東和
author Tung-Ho Wu
吳東和
title Chinese Sentiment Word Acquisition and Its Applications to Opinion Extraction
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/64039269154360333652