An evaluation of Lolita and related natural language processing systems

This research addresses the question, "How do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications, so we are interested in questions of evaluation for such general NLP systems. The thesis has two parts. The first, and main, part concerns LOLITA's participation in the Sixth Message Understanding Conference (MUC-6). The MUC-relevant portion of LOLITA is described in detail, and the adaptation of LOLITA for MUC-6 is discussed, including work undertaken by the author. Performance on a specimen article is analysed qualitatively and in detail, with anonymous comparisons to competitors' output; current LOLITA performance is also examined. A template comparison tool was implemented to aid these analyses. The overall scores are then considered: a methodology for analysis is discussed, and a comparison is made with current scores. The comparison tool is used to analyse how systems performed relative to each other. One method, Correctness Analysis, is particularly interesting: it provides a characterisation of task difficulty and indicates how systems approached a task. Finally, MUC-6 itself is analysed, in particular its methodology and ways of interpreting the results. Several criticisms of MUC-6 are made, along with suggestions for future MUC-style events. The second part considers evaluation from the point of view of general systems. A literature review shows a lack of serious work on this aspect of evaluation. A first-principles discussion of evaluation, starting from a view of NL systems as a particular kind of software, raises several interesting points for single-task evaluation. No evaluations could be suggested for general systems; their value was seen as primarily economic. That is, we are unable to analyse their linguistic capability directly.
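As background for the MUC-style scores discussed in the abstract, the sketch below illustrates how precision, recall and F-measure are typically computed over template slot fills in a MUC-style evaluation. It is only an illustration under the usual MUC scoring assumptions; the function and data names are invented here, and this is not the comparison tool described in the thesis.

    # Illustrative MUC-style scoring over template slot fills (hypothetical names).
    # Fills are dicts mapping (template_id, slot_name) -> set of fill strings.

    def score_slots(system_fills, key_fills):
        """Return (precision, recall, f_measure) for system output against an answer key."""
        correct = sum(len(system_fills.get(slot, set()) & key_values)
                      for slot, key_values in key_fills.items())
        produced = sum(len(values) for values in system_fills.values())
        expected = sum(len(values) for values in key_fills.values())
        precision = correct / produced if produced else 0.0
        recall = correct / expected if expected else 0.0
        f_measure = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
        return precision, recall, f_measure

    # Example use with made-up fills:
    key = {("T1", "ORG_NAME"): {"Acme Corp"}, ("T1", "PER_NAME"): {"J. Smith"}}
    out = {("T1", "ORG_NAME"): {"Acme Corp"}, ("T1", "PER_NAME"): {"Smith"}}
    print(score_slots(out, key))  # -> (0.5, 0.5, 0.5)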

Bibliographic Details
Main Author: Callaghan, Paul
Published: Durham University, 1998
Subjects: 005; Message understanding
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.242662
Alternative Access: http://etheses.dur.ac.uk/5024/
Format: Electronic Thesis or Dissertation