Instance-based Natural Language Generation

In recent years, ranking approaches to Natural Language Generation have become increasingly popular. They abandon the idea of generation as a deterministic decision-making process in favour of approaches that combine overgeneration with ranking at some stage in processing. In this thesis, we investi...


Bibliographic Details
Main Author: Varges, Sebastian
Published: University of Edinburgh 2003
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.735392
id ndltd-bl.uk-oai-ethos.bl.uk-735392
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-735392; 2018-05-12T03:19:47Z; Instance-based Natural Language Generation; Varges, Sebastian; 2003; University of Edinburgh; http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.735392; http://hdl.handle.net/1842/27574; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
description In recent years, ranking approaches to Natural Language Generation have become increasingly popular. They abandon the idea of generation as a deterministic decision-making process in favour of approaches that combine overgeneration with ranking at some stage in processing. In this thesis, we investigate the use of instance-based ranking methods for surface realization in Natural Language Generation. Our approach to instance-based Natural Language Generation employs two basic components: a rule system that generates a number of realization candidates from a meaning representation and an instance-based ranker that scores the candidates according to their similarity to examples taken from a training corpus. The instance-based ranker uses information retrieval methods to rank output candidates. Our approach is corpus-based in that it uses a treebank (a subset of the Penn Treebank II containing management succession texts) in combination with manual semantic markup to automatically produce a generation grammar. Furthermore, the corpus is also used by the instance-based ranker. The semantic annotation of a test portion of the compiled subcorpus serves as input to the generator. In this thesis, we develop an efficient search technique for identifying the optimal candidate based on the A*-algorithm, detail the annotation scheme and grammar construction algorithm and show how a Rete-based production system can be used for efficient candidate generation. Furthermore, we examine the output of the generator and discuss issues like input coverage (completeness), fluency and faithfulness that are relevant to surface generation in general.
author Varges, Sebastian
title Instance-based Natural Language Generation
publisher University of Edinburgh
publishDate 2003
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.735392
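
The two-component design named in the description (a rule system that overgenerates realization candidates, followed by an instance-based ranker that scores them by similarity to corpus examples) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the slot names `PERSON` and `POST`, the hand-written candidate templates, the two toy instances, and the use of cosine similarity over unigram and bigram counts as a stand-in for the information retrieval methods the description mentions. The sketch ranks an exhaustive candidate list and does not attempt the A*-based search or the Rete-based production system.

```python
import math
from collections import Counter

# Hypothetical training instances: corpus sentences with semantic slots
# already abstracted to placeholder tokens (an assumption of this sketch).
INSTANCES = [
    "PERSON was named POST",
    "PERSON resigned as POST",
]


def ngram_features(text):
    """Count unigrams and bigrams, with sentence boundary markers."""
    tokens = ["<s>"] + text.split() + ["</s>"]
    feats = Counter(tokens[1:-1])             # unigrams (no boundary markers)
    feats.update(zip(tokens, tokens[1:]))     # bigrams as (left, right) tuples
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def overgenerate():
    """Toy rule system: deliberately returns good and bad candidates."""
    return [
        "PERSON was named POST",
        "PERSON named was POST",
        "POST PERSON was named",
    ]


def score(candidate, instances):
    """Score a candidate by its similarity to the closest instance."""
    cand = ngram_features(candidate)
    return max(cosine(cand, ngram_features(inst)) for inst in instances)


def realize(meaning, instances):
    """Pick the best-scoring candidate and fill its slots from the input."""
    best = max(overgenerate(), key=lambda c: score(c, instances))
    return " ".join(meaning.get(tok, tok) for tok in best.split())


if __name__ == "__main__":
    meaning = {"PERSON": "Ann Lee", "POST": "chief executive"}
    print(realize(meaning, INSTANCES))
```

Including bigrams matters here: with unigram counts alone, all three candidates would contain the same words and receive identical scores, so word order could not influence the ranking.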