Summary: | Master's === National Cheng Kung University === Department of Computer Science and Information Engineering === 104 === More and more users share opinions on a variety of topics through social media. The opinions and sentiments expressed there are a useful resource that helps consumers and companies make decisions. Because of the sheer volume of information, however, it is difficult for people to read and summarize the sentiment orientation of social media content. Previous studies have often applied sentiment analysis at the document level to mine and summarize user opinions of items. Document-level analysis has a limitation: it cannot determine which aspects of an item users are happy or unhappy about. Sequential pattern mining can be used to extract such aspects, but traditional sequential pattern mining has two problems for aspect extraction: a lattice structure problem and a flexibility problem. Furthermore, even if the aspects are extracted and their sentiment orientation is classified precisely, there will be too many aspects for people to pick out the information they want to see. To address these problems, this study proposes a framework for aspect-based pros and cons summarization with flexible sequential rule mining, which helps users summarize information and make decisions based on social media.
The aim of this study is to develop an aspect-based pros and cons summarization system. Flexible sequential rule mining is proposed to discover sequential rules, and the rules are then used to extract the aspects of an item. Next, the sentiment orientation of each aspect is classified with a machine-learning approach. Finally, the aspects and their sentiment information are used to summarize the pros and cons for a given query (item) based on social media.
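The three stages described above (rule-based aspect extraction, sentiment classification, pros and cons summarization) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the rules in `RULES` are hand-written rather than mined, the `*` wildcard is only a crude stand-in for flexible matching, and the toy `POLARITY` lexicon replaces the machine-learning classifier.

```python
from collections import defaultdict

# Hypothetical rules (not mined from data): a POS-tag pattern plus the
# position inside the pattern that holds the aspect word.
# "*" matches any single tag, a crude stand-in for flexible matching.
RULES = [
    (("NN", "VB", "JJ"), 0),       # e.g. "battery is great"
    (("JJ", "NN"), 1),             # e.g. "poor screen"
    (("NN", "VB", "*", "JJ"), 0),  # e.g. "camera is very sharp"
]

# Toy polarity lexicon standing in for a trained sentiment classifier.
POLARITY = {"great": 1, "sharp": 1, "poor": -1, "slow": -1}


def match_at(tags, start, pattern):
    """Check whether the pattern matches the tag sequence at the given offset."""
    if start + len(pattern) > len(tags):
        return False
    return all(p == "*" or p == t for p, t in zip(pattern, tags[start:]))


def extract_aspects(tagged):
    """Return (aspect, matched_words) pairs for every rule match in a sentence."""
    words = [w for w, _ in tagged]
    tags = [t for _, t in tagged]
    found = []
    for pattern, aspect_pos in RULES:
        for i in range(len(tags)):
            if match_at(tags, i, pattern):
                found.append((words[i + aspect_pos], words[i:i + len(pattern)]))
    return found


def summarize(sentences):
    """Count, per aspect, how many matches are positive (pros) or negative (cons)."""
    summary = {"pros": defaultdict(int), "cons": defaultdict(int)}
    for tagged in sentences:
        for aspect, window in extract_aspects(tagged):
            score = sum(POLARITY.get(w, 0) for w in window)
            if score > 0:
                summary["pros"][aspect] += 1
            elif score < 0:
                summary["cons"][aspect] += 1
    return {side: dict(counts) for side, counts in summary.items()}


# Pre-tagged toy reviews; a real system would run a POS tagger first.
reviews = [
    [("battery", "NN"), ("is", "VB"), ("great", "JJ")],
    [("poor", "JJ"), ("screen", "NN")],
    [("camera", "NN"), ("is", "VB"), ("very", "RB"), ("sharp", "JJ")],
]
result = summarize(reviews)
print(result)
```

In this sketch the third review is matched only through the wildcard rule, which hints at why rigid patterns alone can miss aspects.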
In the experiments, we built a data set to evaluate the accuracy of aspect retrieval and aspect-based sentiment classification. For aspect retrieval, the results showed that flexible sequential rule mining has high precision but low recall, and that it outperformed the baseline methods. For aspect-based sentiment classification, the results showed that aspect retrieval is an important part of classifying the sentiment of aspects, and that our model outperformed the baseline methods. Finally, we selected the two products with the smallest and largest quantities of data to discuss the results of pros and cons summarization. Performance on the small data set was better than on the large one. Most of the experimental data belonged to a particular area, so the rules generated from the experimental data set may not be applicable to other contexts.
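To make the "high precision but low recall" observation concrete, the following sketch computes both measures for a set of extracted aspects against a gold-standard set. It assumes exact-match, set-based scoring, which may differ from the evaluation protocol used in the thesis, and the aspect names are invented for illustration.

```python
def precision_recall(extracted, gold):
    """Set-based precision and recall of extracted aspects against gold aspects."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented example: every extracted aspect is correct, but half are missed.
gold_aspects = {"battery", "screen", "camera", "price"}
extracted_aspects = {"battery", "screen"}
precision, recall = precision_recall(extracted_aspects, gold_aspects)
print(precision, recall)
```

Here precision is 1.0 and recall is 0.5: strict rules rarely extract a wrong aspect but leave many correct ones unfound.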
In this study, we developed an aspect-based pros and cons summarization system. By building an experimental data set for evaluation and conducting a case study, we verified the availability and validity of the system. We expect the system to help people read and summarize sentiment orientation on social media, and thereby assist them in making decisions or determining marketing strategies.
|