Summary: | Master's === Soochow University === Department of Financial Engineering and Actuarial Mathematics === 105 === This paper explores the relationship between financial news headlines and TAIEX volatility using text mining and machine learning. Converting the headlines into features through text mining produces a very high-dimensional feature space. Because Spark computes in memory, it handles big data faster than Hadoop MapReduce, so this paper uses Spark as its analysis tool. The study draws on two data sets: the daily financial news headlines of a newspaper from 2010 to 2016, and the daily TAIEX obtained from TEJ. The TAIEX series is converted into real volatility, which is divided into two categories. The headlines are transformed into features with Spark’s text mining tool. Finally, a forecasting model relating the headlines to the real-volatility category is built with Spark’s machine learning tool, and the model's performance is evaluated.
|