Extracting Events from Social Media and the Web

Bibliographic Details
Main Author: Zong, Shi
Language: English
Published: The Ohio State University / OhioLINK 2020
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=osu1606487414850247
Summary: The last decade has witnessed tremendous advances in technology, which have led to an explosion of user-generated text, including the web and short informal texts in microblogs such as Twitter. This motivates the need for automatic text processing techniques to extract, aggregate, and analyze this huge amount of information, which no one could handle manually.

In this thesis, we present three efforts that use computational linguistics approaches to extract events from social media and the web. For each study, we develop resources and models for extracting structured information from unstructured data. We then demonstrate the value of analyzing these extracted events.

In the first study, we analyze the perceived severity of cybersecurity threats reported on social media. We build a sensor that automatically scans tweets mentioning cybersecurity threats and evaluates the threats' severity based on the language used to describe them. Our experimental results show that our predicted severity scores are correlated with the expert-assigned scores in the National Vulnerability Database (NVD) and can be used as an indicator of whether a threat will be exploitable in the wild.

In the second study, we examine people's linguistic behavior when they make predictions about future events. We extract people's predictions from the geopolitical and financial domains and investigate a number of linguistic metrics over people's justifications of their predictions. We further demonstrate the possibility of accurately predicting forecasting skill using a model that is based solely on language.

In the third study, we present a corpus that can be used for automatically extracting COVID-19 related events from Twitter. Based on our newly manually annotated dataset, we build a semantic search system that allows users to search for a variety of information using different queries related to COVID-19, such as "Who tested positive that has close contact with Boris Johnson?" or "What are the cure methods that people think effective?" We believe this semantic search system could help address information overload for professionals who want to stay on top of recent developments related to COVID-19.

Rights: unrestricted. This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
Collection: NDLTD
Subjects: Computer Science