Knowledge integration in machine reading

Machine reading is the artificial-intelligence task of automatically reading a corpus of texts and, from the contents, building a knowledge base that supports automated reasoning and question answering. Success at this task could fundamentally solve the knowledge acquisition bottleneck – the widely r...

Bibliographic Details
Main Author: Kim, Doo Soon
Format: Others
Language: English
Published: 2011
Subjects:
NLP
Online Access: http://hdl.handle.net/2152/ETD-UT-2011-08-4049
id ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2011-08-4049
record_format oai_dc
spelling ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2011-08-4049; 2015-09-20T17:04:14Z; Knowledge integration in machine reading; Kim, Doo Soon; Machine reading; Knowledge integration; Artificial intelligence; Text understanding; NLP; text; 2011-11-04T19:41:35Z; 2011-11-04T19:41:35Z; 2011-08; 2011-11-04; August 2011; 2011-11-04T19:41:45Z; thesis; application/pdf; http://hdl.handle.net/2152/ETD-UT-2011-08-4049; 2152/ETD-UT-2011-08-4049; eng
collection NDLTD
language English
format Others
sources NDLTD
topic Machine reading
Knowledge integration
Artificial intelligence
Text understanding
NLP
description Machine reading is the artificial-intelligence task of automatically reading a corpus of texts and building, from their contents, a knowledge base that supports automated reasoning and question answering. Success at this task could fundamentally solve the knowledge acquisition bottleneck: the widely recognized problem that knowledge-based AI systems are difficult and expensive to build because it is hard to acquire knowledge from authoritative sources and assemble it into useful knowledge bases. One challenge inherent in machine reading is knowledge integration, the task of correctly and coherently combining knowledge snippets extracted from texts. This dissertation shows that knowledge integration can be automated and that it can significantly improve the performance of machine reading. We focus on two contributions of knowledge integration. The first improves the coherence of learned knowledge bases so that they better support automated reasoning and question answering. Knowledge integration achieves this by aligning knowledge snippets that contain overlapping content. The alignment is difficult because the snippets can use significantly different surface forms; in one common type of variation, two snippets express overlapping content at different levels of granularity or detail. Our matcher can “see past” this difference to align knowledge snippets drawn from a single document, from multiple documents, or from a document and a background knowledge base. The second contribution improves text interpretation. Our approach delays ambiguity resolution so that a machine-reading system can maintain multiple candidate interpretations. This is useful because, as the system reads through texts, evidence typically accumulates that helps the knowledge integration system resolve ambiguities correctly. To avoid a combinatorial explosion in the number of candidate interpretations, we propose a packed representation that compactly encodes all the candidates, and we present an algorithm that prunes interpretations from the packed representation as evidence accumulates. We evaluate our work by building and testing two prototype machine-reading systems and measuring the quality of the knowledge bases they construct. The evaluation shows that our knowledge integration algorithms improve the cohesiveness of the knowledge bases, indicating an improved ability to support automated reasoning and question answering. It also shows that postponing ambiguity resolution improves the system’s accuracy at text interpretation.
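
The idea of a packed representation with evidence-driven pruning, as described in the abstract, can be illustrated with a small sketch. This is not the dissertation's actual data structure or algorithm: the class `PackedInterpretation`, its method names, and the example slots are all hypothetical, and the sketch assumes the simplest possible case in which ambiguities are independent of one another. Each ambiguous slot stores its own set of candidates, so the full set of interpretations (the Cartesian product) is never enumerated.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import prod


@dataclass
class PackedInterpretation:
    """Hypothetical packed encoding: one candidate set per ambiguous slot."""

    slots: dict[str, set[str]] = field(default_factory=dict)

    def add_ambiguity(self, slot: str, candidates: set[str]) -> None:
        # Keep the alternatives local to the slot instead of
        # enumerating every combination of choices.
        self.slots[slot] = set(candidates)

    def count_interpretations(self) -> int:
        # Number of full interpretations implied by the packed form.
        return prod(len(c) for c in self.slots.values())

    def stored_candidates(self) -> int:
        # Number of candidates actually held in memory.
        return sum(len(c) for c in self.slots.values())

    def prune(self, slot: str, rejected: set[str]) -> None:
        # Drop candidates contradicted by new evidence. If the evidence
        # would eliminate every candidate, leave the slot unchanged.
        remaining = self.slots[slot] - rejected
        if remaining:
            self.slots[slot] = remaining

    def resolved(self) -> dict[str, str]:
        # Slots whose ambiguity has been narrowed to a single candidate.
        return {s: next(iter(c)) for s, c in self.slots.items() if len(c) == 1}


packed = PackedInterpretation()
packed.add_ambiguity("sense:bank", {"financial-institution", "river-bank"})
packed.add_ambiguity("attachment:with-telescope", {"verb", "noun"})
packed.add_ambiguity("referent:it", {"engine", "piston", "cylinder"})

print(packed.count_interpretations())  # 12 interpretations implied
print(packed.stored_candidates())      # 7 candidates stored

# Later text mentions a loan, which rules out the river reading.
packed.prune("sense:bank", {"river-bank"})
print(packed.count_interpretations())  # 6
print(packed.resolved())               # {'sense:bank': 'financial-institution'}
```

Under this independence assumption, storage grows with the sum of the candidate counts while the number of encoded interpretations grows with their product, which is the kind of saving the abstract attributes to packing. A realistic system would also need to represent dependencies between choices, which this sketch leaves out.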
author Kim, Doo Soon
title Knowledge integration in machine reading
publishDate 2011
url http://hdl.handle.net/2152/ETD-UT-2011-08-4049