Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices

Bibliographic Details
Main Author: Branko Kavšek
Format: Article
Language: English
Published: University of Primorska 2017-06-01
Series: Managing Global Transitions
Subjects:
Online Access: http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf
Description
Summary: Stock market analysis is one of the biggest areas of interest for text mining. Many researchers have proposed approaches that use text information to predict the movement of stock market indices. Many of these approaches focus either on maximising the predictive accuracy of the model or on devising alternative methods for model evaluation. In this paper, we propose a more descriptive approach that focuses on the models themselves, trying to identify the individual words in the text that most affect the movement of stock market indices. We use data from two sources (for the past eight years): the daily data for the Dow Jones Industrial Average index (‘open’ and ‘close’ values for each trading day) and the headlines of the 25 most upvoted news items on the Reddit World News channel for the previous ‘trading days.’ By applying machine learning algorithms to these data and analysing the individual words that appear in the final predictive models, we find that the words gay, propaganda and massacre are typically associated with a daily increase of the stock index, while the word Iran mostly coincides with its decrease. While this work presents a first step towards qualitative analysis of stock market models, there is still plenty of room for improvement.
ISSN: 1581-6311, 1854-6935
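
The idea in the summary, relating individual headline words to daily index movement, can be illustrated with a minimal sketch. This is not the paper's method (the record does not name the algorithms used); it assumes a day counts as "up" when the close exceeds the open, and it scores each word by a smoothed log-odds of appearing on up days versus down days. The function names, the smoothing constant and the toy data are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def label_day(open_value, close_value):
    """Label a trading day 'up' if the index closed above its open (an assumption)."""
    return "up" if close_value > open_value else "down"


def tokenize(headline):
    """Lowercase a headline and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", headline.lower())


def word_log_odds(days, smoothing=1.0):
    """Score each word by how strongly it is associated with 'up' days.

    days: iterable of (open_value, close_value, headlines) tuples.
    Returns a dict mapping word -> smoothed log-odds; positive values mean the
    word appears on a larger share of up days than down days.
    """
    up_days_with_word = Counter()
    down_days_with_word = Counter()
    n_up = n_down = 0

    for open_value, close_value, headlines in days:
        # Count each word at most once per day (presence, not frequency).
        words = set()
        for headline in headlines:
            words.update(tokenize(headline))
        if label_day(open_value, close_value) == "up":
            n_up += 1
            up_days_with_word.update(words)
        else:
            n_down += 1
            down_days_with_word.update(words)

    scores = {}
    for word in set(up_days_with_word) | set(down_days_with_word):
        # Additive smoothing keeps the ratio finite for words seen in one class only.
        p_up = (up_days_with_word[word] + smoothing) / (n_up + 2 * smoothing)
        p_down = (down_days_with_word[word] + smoothing) / (n_down + 2 * smoothing)
        scores[word] = math.log(p_up / p_down)
    return scores


# Invented toy data, NOT taken from the paper: (open, close, headlines).
toy_days = [
    (100.0, 101.0, ["Alpha talks resume", "Markets calm"]),
    (101.0, 100.5, ["Beta dispute widens"]),
    (100.5, 102.0, ["Alpha summit ends"]),
    (102.0, 101.0, ["Beta sanctions announced", "Markets calm"]),
]

toy_scores = word_log_odds(toy_days)
for word, score in sorted(toy_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{word:12s} {score:+.3f}")
```

A score like this only describes co-occurrence in the data; it says nothing about causation, and with real headlines many words would need far more days of data before their scores became meaningful.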