Tracking the outbreak of diseases Using Twitter : A Machine Learning Approach
In this project I investigated the correlation between mentions of illness on Twitter and the number of calls to the Swedish medical information services (Sjukvårdsupplysningen). The project considered only tweets located in Sweden and written in Swedish. In order to fulfill the aim of the pro...
Main Author: | Bohlin, Erik |
---|---|
Format: | Others |
Language: | English |
Published: | Uppsala universitet, Institutionen för informationsteknologi, 2012 |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-180183 |
id |
ndltd-UPSALLA1-oai-DiVA.org-uu-180183 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-uu-180183 · 2013-01-08T13:52:44Z · Tracking the outbreak of diseases Using Twitter : A Machine Learning Approach · eng · Bohlin, Erik · Uppsala universitet, Institutionen för informationsteknologi · 2012 · Student thesis · info:eu-repo/semantics/bachelorThesis · text · http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-180183 · UPTEC IT, 1401-5749 ; 12 014 · application/pdf · info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
description |
In this project I investigated the correlation between mentions of illness on Twitter and the number of calls to the Swedish medical information services (Sjukvårdsupplysningen). The project considered only tweets located in Sweden and written in Swedish. To fulfill the aim of the project, I used an SVM classifier trained on 20,000 tweets manually labeled as indicating sickness or not indicating sickness. The resulting classifier was then applied to roughly half a million tweets collected during the spring of 2012, and the results were correlated with data from the Swedish medical information services. Compared with weekly values from the medical information services, the results show a Pearson correlation of 0.8707051 (p = 0.00225). I also fitted an ETS model to the Twitter data to try to predict future values; however, I have not been able to evaluate the accuracy of these predictions. |
author |
Bohlin, Erik |
title |
Tracking the outbreak of diseases Using Twitter : A Machine Learning Approach |
publisher |
Uppsala universitet, Institutionen för informationsteknologi |
publishDate |
2012 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-180183 |
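The description above reports a Pearson correlation between weekly counts of sickness-related tweets and weekly call volumes. The following is a minimal sketch of that comparison step only, not the thesis's actual code: the function names `weekly_counts` and `pearson`, the use of ISO weeks for aggregation, and all of the sample dates and call figures are hypothetical and chosen purely for illustration. It assumes the tweets have already been labeled as sickness-related by a classifier.

```python
from collections import Counter
from datetime import date
from math import sqrt


def weekly_counts(dates):
    """Count items per ISO (year, week). Aggregating by ISO week is an assumption."""
    counts = Counter()
    for d in dates:
        iso_year, iso_week, _ = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical dates of tweets already classified as sickness-related.
sick_tweet_dates = [
    date(2012, 3, 5), date(2012, 3, 6),
    date(2012, 3, 13), date(2012, 3, 14), date(2012, 3, 15),
    date(2012, 3, 20),
]

# Hypothetical weekly call volumes, keyed by ISO (year, week).
calls_per_week = {(2012, 10): 120, (2012, 11): 180, (2012, 12): 90}

tweets_per_week = weekly_counts(sick_tweet_dates)
weeks = sorted(calls_per_week)
tweet_series = [tweets_per_week.get(week, 0) for week in weeks]
call_series = [calls_per_week[week] for week in weeks]

r = pearson(tweet_series, call_series)
print(f"Pearson r over {len(weeks)} weeks: {r:.4f}")
```

With only a handful of weekly points, as in this toy data, a high coefficient says little on its own, which is why the description reports a p-value alongside the correlation.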