Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices

Stock market analysis is one of the biggest areas of interest for text mining. Many researchers have proposed approaches that use text information to predict the movement of stock market indices. Many of these approaches focus either on maximising the predictive accuracy of the model or...

Bibliographic Details
Main Author: Branko Kavšek
Format: Article
Language: English
Published: University of Primorska 2017-06-01
Series: Managing Global Transitions
Subjects: stock markets; text mining; machine learning; predictive modelling; natural language processing
Online Access: http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf
id doaj-f7f730df7c9a4645bd5961088f58806b
record_format Article
spelling doaj-f7f730df7c9a4645bd5961088f58806b
2020-11-24T21:48:36Z
eng
University of Primorska
Managing Global Transitions
1581-6311
1854-6935
2017-06-01
15
2
109
121
10.26493/1854-6935.15.109-121
Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices
Branko Kavšek
University of Primorska, Slovenia
http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf
stock markets
text mining
machine learning
predictive modelling
natural language processing
collection DOAJ
language English
format Article
sources DOAJ
author Branko Kavšek
title Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices
publisher University of Primorska
series Managing Global Transitions
issn 1581-6311
1854-6935
publishDate 2017-06-01
description Stock market analysis is one of the biggest areas of interest for text mining. Many researchers have proposed approaches that use text information to predict the movement of stock market indices. Many of these approaches focus either on maximising the predictive accuracy of the model or on devising alternative methods for model evaluation. In this paper, we propose a more descriptive approach focusing on the models themselves, trying to identify the individual words in the text that most affect the movement of stock market indices. We use data from two sources (for the past eight years): the daily data for the Dow Jones Industrial Average index (‘open’ and ‘close’ values for each trading day) and the headlines of the 25 most-voted news items on the Reddit World News Channel for the previous ‘trading days.’ By applying machine learning algorithms to these data and analysing the individual words that appear in the final predictive models, we find that the words gay, propaganda and massacre are typically associated with a daily increase of the stock index, while the word IRAN mostly coincides with its decrease. While this work presents a first step towards qualitative analysis of stock market models, there is still plenty of room for improvement.
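The setup described in the abstract (a daily up/down label derived from the index's open and close values, paired with the words in that day's headlines) can be illustrated with a minimal Python sketch. This is not the article's method: the record does not say which machine learning algorithms were used, so a plain per-word tally of up-days and down-days stands in for a model. The function names, the `(open, close, headlines)` record layout, and the toy data are all assumptions made for illustration, and the alignment of headlines to the following trading day is left out.

```python
import re
from collections import Counter


def label_day(open_value, close_value):
    """Return 'up' if the index closed above its open, else 'down'."""
    return "up" if close_value > open_value else "down"


def tokenize(headline):
    """Lower-case a headline and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", headline.lower())


def word_direction_counts(days):
    """Count, per word, how many up-days and down-days it appeared on.

    `days` is an iterable of (open_value, close_value, headlines) tuples;
    this layout is a hypothetical one chosen for the sketch.
    """
    counts = {"up": Counter(), "down": Counter()}
    for open_value, close_value, headlines in days:
        direction = label_day(open_value, close_value)
        # Collect the day's vocabulary so each word counts once per day.
        words = set()
        for headline in headlines:
            words.update(tokenize(headline))
        counts[direction].update(words)
    return counts


# Invented placeholder data, not taken from the article or its sources.
toy_days = [
    (100.0, 101.5, ["Alpha summit opens", "Beta talks resume"]),
    (101.5, 100.2, ["Beta talks stall"]),
    (100.2, 100.9, ["Alpha summit ends"]),
]

counts = word_direction_counts(toy_days)
print(counts["up"]["alpha"], counts["down"]["beta"])
```

A tally like this only shows which words co-occur with up-days or down-days; it does not reproduce the predictive models or the word-level findings reported in the abstract.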
topic stock markets
text mining
machine learning
predictive modelling
natural language processing
url http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf