A Deep-Learning-Based Long Article Analysis Method for Sentiment Extraction from Japanese Animation Viewers Comments

Bibliographic Details
Main Authors: Ying-Tse Lee, 李映澤
Other Authors: Chin-Shyurng Fahn
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/898crq
Description
Summary: Master's === National Taiwan University of Science and Technology (國立臺灣科技大學) === Department of Computer Science and Information Engineering (資訊工程系) === 107 === In the information era, the sheer volume of content on the Internet is a major issue. We use a computer to automatically classify the emotions expressed in many long comments and thereby extract sentiment. Traditional long-article summarization and sentiment classification methods define a dictionary of sentiment words in advance, but real-world usage is more complicated: such a dictionary can neither enumerate every word nor fully define each one. To address these shortcomings, we design a long-article analysis system based on deep learning. Once the model is fully trained, the system automatically classifies reviewers' sentiment. By collecting a large number of reviews, we can determine the market reaction to current animation works and thus achieve the goal of sentiment extraction.
We propose an automatic sentiment classification method for long comments in the field of animation, together with a related dataset. First, we use crawlers to fetch a large number of relevant labeled comments and summarize the articles through preprocessing and Skip-thoughts. Then we train a deep recurrent neural network that combines a Bi-GRU with a self-attention mechanism, which yields the sentiment classification model. Integrating this model into our long-comment analysis and classification system gives it the ability to extract emotions from long comments. In addition to the label and the comment, each record we collect contains more detailed categories and information about the animation work, which can serve as material for future research. In the model training stage, we apply data augmentation to enlarge the set of collected comments and then let the model learn the features. Finally, the system can distinguish between positive and negative sentiment.
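The self-attention step mentioned in the summary can be illustrated with a minimal sketch. The abstract does not give the exact attention formulation, so the code below assumes a simple dot-product attention pooling: each hidden state (as a Bi-GRU would produce per time step) is scored against a query vector, the scores are normalized with a softmax, and the weighted sum forms the comment representation. The names `attention_pool`, `hidden_states`, and `query`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Pool a sequence of hidden states into one vector.

    Assumption: dot-product scoring against a single query vector,
    which in a trained model would be a learned parameter.
    """
    # One scalar score per time step: h_t . query
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the hidden states, computed dimension by dimension
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy hidden states standing in for Bi-GRU outputs (made-up values).
hidden_states = [[0.1, 0.0], [0.9, 0.2], [0.0, -0.5]]
query = [2.0, 0.0]
context, weights = attention_pool(hidden_states, query)
```

In this toy input the second time step has the largest dot product with the query, so it receives the largest weight and dominates the pooled vector. A classifier layer would then map `context` to a positive or negative label.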
In the experiments, we evaluate on public datasets and analyze comments on different genres of Japanese animation works, such as Action, Adventure, Comedy, and School. The system based on the proposed method predicts sentiment correctly in most cases. On the public datasets IMDB, SST2, MPQA, and MR, the sentiment classification accuracy is 89.9%, 83.3%, 87.3%, and 86.0%, respectively. On our own dataset, the accuracy reaches 84.7%, and execution is fast, averaging about 0.001 seconds per prediction. These results indicate that the system can predict in real time.
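The two reported quantities, classification accuracy and average time per prediction, could be measured along the following lines. This is only a sketch of the measurement, not the thesis's evaluation code: `toy_predict` is a hypothetical keyword rule standing in for the trained model, and the four labeled examples are invented for illustration.

```python
import time


def evaluate(predict, examples):
    """Return (accuracy, mean seconds per prediction) over labeled examples."""
    correct = 0
    start = time.perf_counter()
    for text, label in examples:
        if predict(text) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    count = len(examples)
    return correct / count, elapsed / count


def toy_predict(text):
    # Hypothetical stand-in for the trained classifier: a single keyword rule.
    return "positive" if "great" in text else "negative"


# Invented examples; the last one is deliberately misclassified by the toy rule.
examples = [
    ("a great show", "positive"),
    ("boring plot", "negative"),
    ("great animation", "positive"),
    ("I loved it", "positive"),
]
accuracy, seconds_per_prediction = evaluate(toy_predict, examples)
```

Timing the whole loop and dividing by the number of examples gives an average that is less sensitive to timer resolution than timing each call separately.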