Summary: | Master's === National Sun Yat-sen University === Department of Finance (Graduate Institute) === 107 === Online news has become one of the bases on which investors make investment decisions. However, financial news websites generate so much information every day that investors can no longer rely on manual reading and screening to judge and verify the market sentiment reflected in each news report.
To help investors grasp current market sentiment quickly, we use text mining and text classification techniques to classify news sentiment. This study collects Taiwanese stock market news from Anue Financial News and compares different text pre-processing methods and classifiers to find the combination with the best classification performance.
The empirical results show: (1) N-gram feature extraction improves the accuracy of all classifiers, especially the naive Bayes classifier, where it effectively overcomes the shortcomings of the independence assumption. (2) TF-IDF feature selection is effective only for the naive Bayes classifier; with fewer words retained, it improves accuracy and reduces training time. (3) Chi-square test and mutual information feature selection improve the accuracy of both fastText and the multi-layer perceptron. Furthermore, the combination of chi-square feature selection and fastText achieved the best performance in this study.
|
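The two feature steps named in the results, N-gram extraction and chi-square feature selection, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `word_ngrams` and `chi_square_scores`, the toy English tokens, and the `"pos"`/`"neg"` labels are all assumptions made for illustration, and real Chinese news text would first need word segmentation.

```python
from collections import Counter


def word_ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max, joined by underscores."""
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append("_".join(tokens[i:i + n]))
    return grams


def chi_square_scores(docs, labels, positive_label):
    """Score each feature with the 2x2 chi-square statistic
    (feature present/absent vs. positive/negative class)."""
    n_docs = len(docs)
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = n_docs - n_pos
    df_pos, df_neg = Counter(), Counter()
    for feats, y in zip(docs, labels):
        # Count document frequency, so each feature counts once per document.
        (df_pos if y == positive_label else df_neg).update(set(feats))
    scores = {}
    for feat in set(df_pos) | set(df_neg):
        a = df_pos[feat]   # positive documents containing the feature
        b = df_neg[feat]   # negative documents containing the feature
        c = n_pos - a      # positive documents without the feature
        d = n_neg - b      # negative documents without the feature
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[feat] = n_docs * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Hypothetical toy corpus (already tokenized), not data from the study.
raw_docs = [
    ["stock", "price", "rises"],
    ["profit", "rises", "sharply"],
    ["stock", "price", "falls"],
    ["profit", "falls", "sharply"],
]
labels = ["pos", "pos", "neg", "neg"]

docs = [word_ngrams(tokens, n_max=2) for tokens in raw_docs]
scores = chi_square_scores(docs, labels, positive_label="pos")

# Keep the k highest-scoring features as the reduced vocabulary.
top_features = sorted(scores, key=scores.get, reverse=True)[:4]
```

In this toy corpus, words that appear equally in both classes (such as `stock`) score zero, while class-specific words and bigrams score higher, which is the property that makes chi-square useful for trimming the vocabulary before training a classifier.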