Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices
Stock market analysis is one of the biggest areas of interest for text mining. Many researchers have proposed different approaches that use text information for predicting the movement of stock market indices. Many of these approaches focus either on maximising the predictive accuracy of the model or...
Main Author: | Branko Kavšek |
---|---|
Format: | Article |
Language: | English |
Published: | University of Primorska, 2017-06-01 |
Series: | Managing Global Transitions |
Subjects: | stock markets, text mining, machine learning, predictive modelling, natural language processing |
Online Access: | http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf |
id |
doaj-f7f730df7c9a4645bd5961088f58806b |
---|---|
record_format |
Article |
spelling |
doaj-f7f730df7c9a4645bd5961088f58806b; 2020-11-24T21:48:36Z; eng; University of Primorska; Managing Global Transitions; ISSN 1581-6311, 1854-6935; 2017-06-01; vol. 15, no. 2, pp. 109-121; doi:10.26493/1854-6935.15.109-121; Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices; Branko Kavšek (University of Primorska, Slovenia); http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf; stock markets; text mining; machine learning; predictive modelling; natural language processing |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Branko Kavšek |
title |
Using Words from Daily News Headlines to Predict the Movement of Stock Market Indices |
publisher |
University of Primorska |
series |
Managing Global Transitions |
issn |
1581-6311 1854-6935 |
publishDate |
2017-06-01 |
description |
Stock market analysis is one of the biggest areas of interest for text mining.
Many researchers have proposed different approaches that use text information
for predicting the movement of stock market indices. Many of these approaches
focus either on maximising the predictive accuracy of the model
or on devising alternative methods for model evaluation. In this paper,
we propose a more descriptive approach focusing on the models themselves,
trying to identify the individual words in the text that most affect
the movement of stock market indices. We use data from two sources (for
the past eight years): the daily data for the Dow Jones Industrial Average
index (‘open’ and ‘close’ values for each trading day) and the headlines of
the 25 most-voted news items on the Reddit World News Channel for the previous
‘trading days.’ By applying machine learning algorithms to these data
and analysing the individual words that appear in the final predictive models,
we find that the words gay, propaganda and massacre are typically associated
with a daily increase of the stock index, while the word IRAN mostly
coincides with its decrease. While this work presents a first step towards
qualitative analysis of stock market models, there is still plenty of room for
improvement. |
topic |
stock markets; text mining; machine learning; predictive modelling; natural language processing |
url |
http://www.hippocampus.si/ISSN/1854-6935/15.109-121.pdf |
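Illustrative sketch (not part of the catalogued record): the description above outlines labelling each trading day from the index's 'open' and 'close' values and then looking for headline words associated with an increase or decrease. The record does not name the algorithms used in the article, so the following Python example swaps in a simple smoothed log-odds score per word as a stand-in. The function names (`label_day`, `tokenize`, `word_log_odds`), the smoothing constant, and the toy headlines and index values are all hypothetical and chosen only to make the example runnable.

```python
import math
import re
from collections import Counter


def label_day(open_value, close_value):
    """Return 1 if the index closed above its open (an 'up' day), else 0."""
    return 1 if close_value > open_value else 0


def tokenize(headline):
    """Lowercase a headline and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", headline.lower())


def word_log_odds(days, smoothing=1.0):
    """Score words by smoothed log-odds of appearing on 'up' vs 'down' days.

    `days` is an iterable of (headlines, open_value, close_value) tuples.
    A word is counted at most once per day (presence, not frequency).
    Positive scores lean towards 'up' days, negative towards 'down' days.
    """
    up_counts, down_counts = Counter(), Counter()
    n_up = n_down = 0
    for headlines, open_value, close_value in days:
        words = {w for h in headlines for w in tokenize(h)}
        if label_day(open_value, close_value):
            n_up += 1
            up_counts.update(words)
        else:
            n_down += 1
            down_counts.update(words)
    scores = {}
    for word in set(up_counts) | set(down_counts):
        p_up = (up_counts[word] + smoothing) / (n_up + 2 * smoothing)
        p_down = (down_counts[word] + smoothing) / (n_down + 2 * smoothing)
        scores[word] = math.log(p_up / p_down)
    return scores


# Hypothetical toy data: (headlines, open, close). Not real headlines or prices.
toy_days = [
    (["Alpha summit opens", "Beta talks continue"], 100.0, 101.5),
    (["Alpha deal signed"], 101.5, 102.0),
    (["Gamma strike widens", "Beta talks stall"], 102.0, 100.5),
    (["Gamma strike ends talks"], 100.5, 100.0),
]

scores = word_log_odds(toy_days)

# Print words from most 'up'-leaning to most 'down'-leaning.
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word:10s} {score:+.3f}")
```

On real data the same scoring could be run over the daily headline sets and index values described in the abstract. An association found this way is descriptive only and does not by itself show that a word affects the index.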