Making diffusion work for you: Classification sans text, finding culprits and filling missing values

Can we find people infected with the flu virus even though they did not visit a doctor? Can the temporal features of a trending hashtag or a keyword indicate which topic it belongs to without any textual information? Given a history of interactions between blogs and news websites, can we predict blo...

Bibliographic Details
Main Author:	Sundareisan, Shashidhar
Other Authors:	Computer Science
Format:	Others
Published:	Virginia Tech 2014
Subjects:	Data Mining; Social Networks; Epidemiology; Culprits; Missing nodes; Diffusion; Protests; Classification
Online Access:	http://hdl.handle.net/10919/49678

id	ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-49678
record_format	oai_dc
spelling	ndltd-VTETD-oai-vtechworks.lib.vt.edu-10919-49678 2021-02-02T05:32:37Z Making diffusion work for you: Classification sans text, finding culprits and filling missing values Sundareisan, Shashidhar Computer Science Prakash, B. Aditya Batra, Dhruv Ramakrishnan, Naren Data Mining Social Networks Epidemiology Culprits Missing nodes Diffusion Protests Classification Can we find people infected with the flu virus even though they did not visit a doctor? Can the temporal features of a trending hashtag or a keyword indicate which topic it belongs to without any textual information? Given a history of interactions between blogs and news websites, can we predict blog posts/news websites that are not in the sample but talk about ``the state of the economy'' in 2008? These questions have two things in common: a network (social networks or human contact networks) and a virus (meme, keyword or the flu virus) diffusing over the network. We can think of interactions like memes, hashtags, influenza infections, computer viruses, etc., as viruses spreading in a network. This treatment allows the use of epidemiologically inspired models to study or model these interactions. Understanding the complex propagation dynamics involved in information diffusion with the help of these models uncovers various non-trivial and interesting results. In this thesis we propose (a) NetFill, a fast and efficient algorithm that finds quantitatively and qualitatively correct infected nodes missing from the sample and also finds the culprits, and (b) SansText, a method that determines which topic a keyword/hashtag belongs to just by looking at the popularity graph of the keyword, without textual analysis. The results derived in this thesis can be used in areas such as epidemiology, news and protest detection, and viral marketing, and can also be used to reduce sampling errors in graphs.
Master of Science 2014-07-25T08:00:49Z 2014-07-25T08:00:49Z 2014-07-24 Thesis vt_gsexam:3444 http://hdl.handle.net/10919/49678 In Copyright http://rightsstatements.org/vocab/InC/1.0/ ETD application/pdf Virginia Tech
collection	NDLTD
format	Others
sources	NDLTD
topic	Data Mining; Social Networks; Epidemiology; Culprits; Missing nodes; Diffusion; Protests; Classification
spellingShingle	Data Mining Social Networks Epidemiology Culprits Missing nodes Diffusion Protests Classification Sundareisan, Shashidhar Making diffusion work for you: Classification sans text, finding culprits and filling missing values
description	Can we find people infected with the flu virus even though they did not visit a doctor? Can the temporal features of a trending hashtag or a keyword indicate which topic it belongs to without any textual information? Given a history of interactions between blogs and news websites, can we predict blog posts/news websites that are not in the sample but talk about ``the state of the economy'' in 2008? These questions have two things in common: a network (social networks or human contact networks) and a virus (meme, keyword or the flu virus) diffusing over the network. We can think of interactions like memes, hashtags, influenza infections, computer viruses, etc., as viruses spreading in a network. This treatment allows the use of epidemiologically inspired models to study or model these interactions. Understanding the complex propagation dynamics involved in information diffusion with the help of these models uncovers various non-trivial and interesting results. In this thesis we propose (a) NetFill, a fast and efficient algorithm that finds quantitatively and qualitatively correct infected nodes missing from the sample and also finds the culprits, and (b) SansText, a method that determines which topic a keyword/hashtag belongs to just by looking at the popularity graph of the keyword, without textual analysis. The results derived in this thesis can be used in areas such as epidemiology, news and protest detection, and viral marketing, and can also be used to reduce sampling errors in graphs. === Master of Science
author2	Computer Science
author_facet	Computer Science Sundareisan, Shashidhar
author	Sundareisan, Shashidhar
author_sort	Sundareisan, Shashidhar
title	Making diffusion work for you: Classification sans text, finding culprits and filling missing values
title_short	Making diffusion work for you: Classification sans text, finding culprits and filling missing values
title_full	Making diffusion work for you: Classification sans text, finding culprits and filling missing values
title_fullStr	Making diffusion work for you: Classification sans text, finding culprits and filling missing values
title_full_unstemmed	Making diffusion work for you: Classification sans text, finding culprits and filling missing values
title_sort	making diffusion work for you: classification sans text, finding culprits and filling missing values
publisher	Virginia Tech
publishDate	2014
url	http://hdl.handle.net/10919/49678
work_keys_str_mv	AT sundareisanshashidhar makingdiffusionworkforyouclassificationsanstextfindingculpritsandfillingmissingvalues
_version_	1719375080361820160
