id ndltd-OhioLink-oai-etd.ohiolink.edu-osu153200096922783
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-osu153200096922783 2021-08-03T07:07:43Z Predicting Knowledge Base Revisions from Realtime Text Streams Konovalov, Alexander Computer Science Computer Engineering knowledge bases event extraction social media distant supervision database revision history Broad-coverage knowledge bases (KBs), such as Freebase, NELL, DBPedia, Wikidata, Microsoft's Satori, and Google's Knowledge Graph, contain large collections of structured facts about things, people, places, and events happening in the world. These KBs have become increasingly important for a wide range of intelligent systems, from information retrieval and question answering to Facebook's Graph Search, IBM's Watson, Google Home, and more. Previous work on learning to populate knowledge bases from text has, for the most part, made the simplifying assumption that facts remain constant over time. But this is inaccurate: we live in a rapidly changing world. Knowledge should not be viewed as a static snapshot, but instead as a rapidly evolving set of facts that must change as the world changes. In this thesis we demonstrate the feasibility of accurately identifying entity-transition events that drive changes to a knowledge base from real-time news and social media text streams. We use Wikipedia's revision history for distant supervision to learn event extractors, and evaluate the extractors based on their ability to predict online updates. Our weakly supervised event extractors are able to predict 10 knowledge base revisions per month at 0.8 precision. By lowering our confidence threshold, we can suggest 34.3 correct edits per month at 0.4 precision. 64% of predicted edits were detected before they were added to Wikipedia. The average lead time of our forecasted knowledge revisions over Wikipedia's editors is 40 days, demonstrating the utility of our method for suggesting edits that can be quickly verified and added to the knowledge graph.
2018-12-20 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu153200096922783 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic Computer Science
Computer Engineering
knowledge bases
event extraction
social media
distant supervision
database revision history
spellingShingle Computer Science
Computer Engineering
knowledge bases
event extraction
social media
distant supervision
database revision history
Konovalov, Alexander
Predicting Knowledge Base Revisions from Realtime Text Streams
author Konovalov, Alexander
author_facet Konovalov, Alexander
author_sort Konovalov, Alexander
title Predicting Knowledge Base Revisions from Realtime Text Streams
title_short Predicting Knowledge Base Revisions from Realtime Text Streams
title_full Predicting Knowledge Base Revisions from Realtime Text Streams
title_fullStr Predicting Knowledge Base Revisions from Realtime Text Streams
title_full_unstemmed Predicting Knowledge Base Revisions from Realtime Text Streams
title_sort predicting knowledge base revisions from realtime text streams
publisher The Ohio State University / OhioLINK
publishDate 2018
url http://rave.ohiolink.edu/etdc/view?acc_num=osu153200096922783
work_keys_str_mv AT konovalovalexander predictingknowledgebaserevisionsfromrealtimetextstreams
_version_ 1719454482653249536