Knowledge integration in machine reading

Machine reading is the artificial-intelligence task of automatically reading a corpus of texts and, from the contents, building a knowledge base that supports automated reasoning and question answering. Success at this task could fundamentally solve the knowledge acquisition bottleneck – the widely r...

Bibliographic Details
Main Author:	Kim, Doo Soon
Format:	Others
Language:	English
Published:	2011
Subjects:	Machine reading; Knowledge integration; Artificial intelligence; Text understanding; NLP
Online Access:	http://hdl.handle.net/2152/ETD-UT-2011-08-4049

id	ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2011-08-4049
record_format	oai_dc
spelling	ndltd-UTEXAS-oai-repositories.lib.utexas.edu-2152-ETD-UT-2011-08-4049; 2015-09-20T17:04:14Z; Knowledge integration in machine reading; Kim, Doo Soon; Machine reading; Knowledge integration; Artificial intelligence; Text understanding; NLP; text; 2011-11-04T19:41:35Z; 2011-11-04T19:41:35Z; 2011-08; 2011-11-04; August 2011; 2011-11-04T19:41:45Z; thesis; application/pdf; http://hdl.handle.net/2152/ETD-UT-2011-08-4049; 2152/ETD-UT-2011-08-4049; eng
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Machine reading; Knowledge integration; Artificial intelligence; Text understanding; NLP
spellingShingle	Machine reading Knowledge integration Artificial intelligence Text understanding NLP Kim, Doo Soon Knowledge integration in machine reading
description	Machine reading is the artificial-intelligence task of automatically reading a corpus of texts and, from the contents, building a knowledge base that supports automated reasoning and question answering. Success at this task could fundamentally solve the knowledge acquisition bottleneck – the widely recognized problem that knowledge-based AI systems are difficult and expensive to build because of the difficulty of acquiring knowledge from authoritative sources and building useful knowledge bases. One challenge inherent in machine reading is knowledge integration – the task of correctly and coherently combining knowledge snippets extracted from texts. This dissertation shows that knowledge integration can be automated and that it can significantly improve the performance of machine reading. We specifically focus on two contributions of knowledge integration. The first contribution is for improving the coherence of learned knowledge bases to better support automated reasoning and question answering. Knowledge integration achieves this benefit by aligning knowledge snippets that contain overlapping content. The alignment is difficult because the snippets can use significantly different surface forms. In one common type of variation, two snippets might contain overlapping content that is expressed at different levels of granularity or detail. Our matcher can “see past” this difference to align knowledge snippets drawn from a single document, from multiple documents, or from a document and a background knowledge base. The second contribution is for improving text interpretation. Our approach is to delay ambiguity resolution to enable a machine-reading system to maintain multiple candidate interpretations. This is useful because typically, as the system reads through texts, evidence accumulates to help the knowledge integration system resolve ambiguities correctly. To avoid a combinatorial explosion in the number of candidate interpretations, we propose the packed representation to compactly encode all the candidates. Also, we present an algorithm that prunes interpretations from the packed representation as evidence accumulates. We evaluate our work by building and testing two prototype machine reading systems and measuring the quality of the knowledge bases they construct. The evaluation shows that our knowledge integration algorithms improve the cohesiveness of the knowledge bases, indicating their improved ability to support automated reasoning and question answering. The evaluation also shows that our approach to postponing ambiguity resolution improves the system’s accuracy at text interpretation.
author	Kim, Doo Soon
author_facet	Kim, Doo Soon
author_sort	Kim, Doo Soon
title	Knowledge integration in machine reading
title_short	Knowledge integration in machine reading
title_full	Knowledge integration in machine reading
title_fullStr	Knowledge integration in machine reading
title_full_unstemmed	Knowledge integration in machine reading
title_sort	knowledge integration in machine reading
publishDate	2011
url	http://hdl.handle.net/2152/ETD-UT-2011-08-4049
work_keys_str_mv	AT kimdoosoon knowledgeintegrationinmachinereading
_version_	1716822123265130496

Similar Items