Mining User-generated Content for Insights
The proliferation of social media, such as blogs, micro-blogs and social networks, has led to a plethora of readily available user-generated content. The latter offers a unique, uncensored window into emerging stories and events, ranging from politics and revolutions to product perception and the ze...
Main Author: | Angel, Albert-David |
---|---|
Other Authors: | Koudas, Nick |
Language: | en_ca |
Published: | 2012 |
Subjects: | algorithms; data mining; information retrieval; user-generated content |
Online Access: | http://hdl.handle.net/1807/32650 |
id |
ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-32650 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-32650; 2013-04-19T19:57:32Z; Mining User-generated Content for Insights; Angel, Albert-David; algorithms; data mining; information retrieval; user-generated content; 0984; Koudas, Nick; 2012-06; 2012-08-20T15:06:29Z; NO_RESTRICTION; 2012-08-20T15:06:29Z; 2012-08-20; Thesis; http://hdl.handle.net/1807/32650; en_ca |
collection |
NDLTD |
language |
en_ca |
sources |
NDLTD |
topic |
algorithms; data mining; information retrieval; user-generated content; 0984 |
spellingShingle |
algorithms; data mining; information retrieval; user-generated content; 0984; Angel, Albert-David; Mining User-generated Content for Insights |
description |
The proliferation of social media, such as blogs, micro-blogs and social networks, has led to a plethora of readily available user-generated content. The latter offers a unique, uncensored window into emerging stories and events, ranging from politics and revolutions to product perception and the zeitgeist.
Importantly, structured information is available for user-generated content, by dint of its metadata, or can be surfaced via recently commoditized information extraction tools. This wealth of information, in the form of real-world entities and facts mentioned in a document, author demographics, and so on, provides exciting opportunities for mining insights from this content.
Capitalizing upon these, we develop Grapevine, an online system that distills information from the social media collective on a daily basis and facilitates its interactive exploration. To further this goal, we address important research problems, which are also of independent interest. The sheer scale of the data being processed requires our solutions to be highly efficient.
We propose efficient techniques for mining important stories, on a per-user-demographic basis, based on named entity co-occurrences in user-generated content. Building upon these, we propose efficient techniques for identifying emerging stories as they happen, by identifying dense structures in an evolving entity graph.
To facilitate the exploration of these stories, we propose efficient techniques for filtering them, based on users’ textual descriptions of the entities involved.
These gathered insights need to be presented to users in a useful manner, via a diverse set of representative documents; we thus propose efficient techniques for addressing this problem.
Recommending related stories to users is important for navigation purposes. As the way in which these are related to the story being explored is not always clear, we propose efficient techniques for generating recommendation explanations via entity relatedness queries. |
author2 |
Koudas, Nick |
author_facet |
Koudas, Nick Angel, Albert-David |
author |
Angel, Albert-David |
author_sort |
Angel, Albert-David |
title |
Mining User-generated Content for Insights |
title_short |
Mining User-generated Content for Insights |
title_full |
Mining User-generated Content for Insights |
title_fullStr |
Mining User-generated Content for Insights |
title_full_unstemmed |
Mining User-generated Content for Insights |
title_sort |
mining user-generated content for insights |
publishDate |
2012 |
url |
http://hdl.handle.net/1807/32650 |
work_keys_str_mv |
AT angelalbertdavid miningusergeneratedcontentforinsights |
_version_ |
1716582180258316288 |