Mining User-generated Content for Insights

The proliferation of social media, such as blogs, micro-blogs and social networks, has led to a plethora of readily available user-generated content. The latter offers a unique, uncensored window into emerging stories and events, ranging from politics and revolutions to product perception and the ze...


Bibliographic Details
Main Author: Angel, Albert-David
Other Authors: Koudas, Nick
Language: en_ca
Published: 2012
Subjects: algorithms; data mining; information retrieval; user-generated content; 0984
Online Access: http://hdl.handle.net/1807/32650
id ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-32650
record_format oai_dc
spelling ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-32650
2013-04-19T19:57:32Z
Mining User-generated Content for Insights
Angel, Albert-David
algorithms
data mining
information retrieval
user-generated content
0984
Koudas, Nick
2012-06
2012-08-20T15:06:29Z
NO_RESTRICTION
2012-08-20T15:06:29Z
2012-08-20
Thesis
http://hdl.handle.net/1807/32650
en_ca
collection NDLTD
language en_ca
sources NDLTD
topic algorithms
data mining
information retrieval
user-generated content
0984
description The proliferation of social media, such as blogs, micro-blogs and social networks, has led to a plethora of readily available user-generated content. This content offers a unique, uncensored window into emerging stories and events, ranging from politics and revolutions to product perception and the zeitgeist. Importantly, structured information is available for user-generated content by dint of its metadata, or can be surfaced via recently commoditized information extraction tools. This wealth of information, in the form of real-world entities and facts mentioned in a document, author demographics, and so on, provides exciting opportunities for mining insights from this content. Capitalizing upon these opportunities, we develop Grapevine, an online system that distills information from the social media collective on a daily basis and facilitates its interactive exploration. To further this goal, we address important research problems that are also of independent interest. The sheer scale of the data being processed necessitates that our solutions be highly efficient. We propose efficient techniques for mining important stories, on a per-user-demographic basis, based on named entity co-occurrences in user-generated content. Building upon these, we propose efficient techniques for identifying emerging stories as they happen, by identifying dense structures in an evolving entity graph. To facilitate the exploration of these stories, we propose efficient techniques for filtering them based on users’ textual descriptions of the entities involved. These gathered insights need to be presented to users in a useful manner, via a diverse set of representative documents; we thus propose efficient techniques for addressing this problem. Recommending related stories to users is important for navigation purposes. As the way in which these are related to the story being explored is not always clear, we propose efficient techniques for generating recommendation explanations via entity relatedness queries.
author2 Koudas, Nick
author_facet Koudas, Nick
Angel, Albert-David
author Angel, Albert-David
author_sort Angel, Albert-David
title Mining User-generated Content for Insights
publishDate 2012
url http://hdl.handle.net/1807/32650
work_keys_str_mv AT angelalbertdavid miningusergeneratedcontentforinsights
_version_ 1716582180258316288
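
The description mentions mining stories from named entity co-occurrences in user-generated content. As a rough illustration of that general idea only, and not the thesis's actual method, the following minimal Python sketch counts how many documents mention each pair of entities and keeps the pairs that recur. The function names, the `min_support` threshold, and the toy documents are all hypothetical and were chosen for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how many documents mention each unordered pair of entities.

    `docs` is an iterable of entity lists, one list per document
    (a hypothetical input format assumed for this sketch).
    """
    counts = Counter()
    for entities in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


def top_pairs(docs, min_support=2):
    """Return (pair, count) tuples that occur in at least `min_support` documents."""
    counts = cooccurrence_counts(docs)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_support]


# Toy, made-up documents: each is the list of entities extracted from one post.
docs = [
    ["Cairo", "Egypt", "Mubarak"],
    ["Egypt", "Mubarak"],
    ["Apple", "iPad"],
    ["Apple", "iPad", "Egypt"],
]

for pair, n in top_pairs(docs):
    print(pair, n)
```

Counting each pair once per document is a deliberate simplification; an approach working at the scale the description refers to would presumably need far more efficient data structures than an in-memory counter.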