Tracking the outbreak of diseases Using Twitter : A Machine Learning Approach
Main Author: | |
---|---|
Format: | Others |
Language: | English |
Published: | Uppsala universitet, Institutionen för informationsteknologi, 2012 |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-180183 |
Summary: | In this project I investigated the correlation between mentions of illness on Twitter and the number of calls to the Swedish medical information services (Sjukvårdsupplysningen). The project considered only tweets located in Sweden and written in Swedish. To fulfill the aim of the project, I used an SVM classifier trained on 20,000 tweets manually labeled as either indicating sickness or not indicating sickness. The resulting classifier was then applied to roughly half a million tweets collected during the spring of 2012, and the results were correlated with data from the Swedish medical information services. I was able to show a Pearson correlation of 0.8707051 (p = 0.00225) when comparing with weekly values from the medical information services. I also fitted an ETS model to the Twitter data to try to predict future values; however, I have not been able to evaluate the accuracy of these predictions. |
---|
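
The weekly comparison described in the summary can be illustrated with a minimal Python sketch of the Pearson correlation coefficient. This is not the author's code: the function name `pearson`, the variable names, and the weekly counts are all hypothetical, invented only to show the shape of the calculation.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly counts, invented purely for illustration:
# tweets classified as indicating sickness, and calls to the information service.
weekly_sick_tweets = [120, 150, 170, 160, 130, 110]
weekly_calls = [900, 1000, 1150, 1100, 950, 880]

r = pearson(weekly_sick_tweets, weekly_calls)
print(f"r = {r:.3f}")
```

The sketch covers only the coefficient itself; the p-value quoted in the summary would need a separate significance test, which is not shown here.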