Case reuse in textual case-based reasoning

Bibliographic Details
Main Author: Adeyanju, Ibrahim Adepoju
Other Authors: Wiratunga, Nirmalie ; Lothian, Robert
Published: Robert Gordon University 2011
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.542914
Description
Summary: Text reuse involves reasoning with textual solutions of previous problems to solve new similar problems. It is an integral part of textual case-based reasoning (TCBR), which applies the CBR problem-solving methodology to situations where experiences are predominantly captured in text form. Here, we explore two key research questions in the context of textual reuse: first, which parts of a solution are reusable given a problem, and second, how these relevant parts might be reused to generate a textual solution. Reasoning with text is naturally challenging, and this is particularly so with text reuse. However, significant inroads towards addressing this challenge were made possible by knowledge of problem-solution alignment. This knowledge allows us to identify specific parts of a textual solution that are linked to particular problem attributes or attribute values. Accordingly, a text reuse strategy based on implicit alignment is presented to determine which textual solution constructs (words or phrases) need to be adapted. This addresses the question of what to reuse in solution texts and thereby forms the first contribution of this thesis. A generic architecture, the Case Retrieval Reuse Net (CR2N), is used to formalise the reuse strategy. Functionally, this architecture annotates textual constructs in a solution as reusable either with or without adaptation. Key to this annotation is the discovery of reuse evidence mined from neighbourhood characteristics. Experimental results show significant improvements over a retrieve-only system and a baseline reuse technique. We also extended CR2N so that retrieval of similar cases is informed by which solutions are easiest to adapt. This is done by retrieving the top k cases based on their problem similarity and then determining the reusability of their solutions with respect to the target problem. Results from experiments show that reuse-guided retrieval outperforms retrieval without this guidance.
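The two-stage idea behind reuse-guided retrieval (shortlist the top k cases by problem similarity, then prefer the one whose solution looks easiest to reuse) can be sketched as follows. This is a minimal illustration and not the CR2N implementation: the `Case` type, the Jaccard similarity, and the `reusability` score (the share of a candidate's solution tokens that also occur in solutions of the target's nearest neighbours) are all assumptions chosen for brevity.

```python
from typing import NamedTuple


class Case(NamedTuple):
    problem: frozenset  # structured problem attributes (hypothetical encoding)
    solution: tuple     # solution text as a tuple of tokens


def jaccard(a, b):
    """Set overlap in [0, 1]; a placeholder for a real problem-similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reusability(target, candidate, casebase, n=2):
    """Assumed proxy for reuse evidence: fraction of the candidate's solution
    tokens that also appear in the solutions of the n other cases nearest
    to the target problem."""
    others = [c for c in casebase if c is not candidate]
    neighbours = sorted(others, key=lambda c: jaccard(target, c.problem),
                        reverse=True)[:n]
    supported = set().union(*(c.solution for c in neighbours))
    tokens = candidate.solution
    return sum(t in supported for t in tokens) / len(tokens) if tokens else 0.0


def reuse_guided_retrieve(target, casebase, k=2):
    """Stage 1: top k cases by problem similarity.
    Stage 2: among those, return the case with the most reusable solution."""
    top_k = sorted(casebase, key=lambda c: jaccard(target, c.problem),
                   reverse=True)[:k]
    return max(top_k, key=lambda c: reusability(target, c, casebase))
```

With a score of this kind, the case returned need not be the nearest neighbour by problem similarity alone, which is the behaviour the reuse-guided variant is meant to allow.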
Although CR2N exploits implicit alignment to aid text reuse, performance can be greatly improved if there is explicit alignment. Our second contribution is a method to form explicit alignment of structured problem attributes and values to sentences in a textual solution. Thereafter, compositional and transformational approaches to text reuse are introduced to address the question of how to reuse textual solutions. The main idea in the compositional approach is to generate a textual solution by using prototypical sentences across similar authors. The transformational approach, in contrast, adapts the retrieved solution text by replacing sentences aligned to mismatched problem attributes with sentences from the neighbourhood. Experiments confirm the usefulness of these approaches through strong similarity between generated text and human references. The third and final contribution of this research is the use of Machine Translation (MT) evaluation metrics for TCBR. These metrics have been shown to correlate highly with human expert evaluation. In MT research, multiple human references are typically used, as opposed to a single reference or solution per test case. An introspective approach to creating multiple references for evaluation is presented. This is particularly useful for CBR domains where single-reference cases (cases with a single solution per problem) typically form the casebase. For such domains we show how multiple references can be generated by exploiting the CBR similarity assumption. Results indicate that evaluations of TCBR systems with these MT metrics are closer to human judgements.
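One way to picture the multiple-reference idea is the sketch below. It assumes that the similarity assumption is applied by borrowing the solutions of cases whose problems are sufficiently similar to the test problem, and it scores generated text with a clipped unigram precision, a simplified stand-in for a full MT metric such as BLEU. The function names, the Jaccard similarity, and the 0.6 threshold are illustrative assumptions and are not taken from the thesis.

```python
from collections import Counter


def jaccard(a, b):
    """Set overlap in [0, 1]; a placeholder problem-similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def build_references(test_problem, own_reference, casebase, threshold=0.6):
    """Start from the test case's single reference and add the solutions of
    cases whose problems are similar enough (similar problems are assumed
    to have similar solutions). casebase: list of (problem_set, tokens)."""
    references = [own_reference]
    for problem, solution in casebase:
        if jaccard(test_problem, problem) >= threshold and solution not in references:
            references.append(solution)
    return references


def clipped_unigram_precision(candidate, references):
    """Share of candidate tokens matched in the references, with each token's
    count clipped to its maximum count in any single reference."""
    max_ref = Counter()
    for ref in references:
        for token, count in Counter(ref).items():
            max_ref[token] = max(max_ref[token], count)
    matched = sum(min(count, max_ref[token])
                  for token, count in Counter(candidate).items())
    return matched / len(candidate) if candidate else 0.0
```

Under these assumptions, a generated solution that words things like a neighbouring case's solution is no longer penalised simply because the test case's own reference happens to use different wording.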