Complexity and expressiveness for formal structures in Natural Language Processing

The formalized and algorithmic study of human language within the field of Natural Language Processing (NLP) has motivated much theoretical work in the related field of formal languages, in particular the subfields of grammar and automata theory. Motivated and informed by NLP, the papers in this the...

Bibliographic Details
Main Author: Ericson, Petter
Format: Others
Language: English
Published: Umeå universitet, Institutionen för datavetenskap 2017
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014
http://nbn-resolving.de/urn:isbn:9789176017227
id ndltd-UPSALLA1-oai-DiVA.org-umu-135014
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-umu-135014
2018-06-10T05:17:53Z
Complexity and expressiveness for formal structures in Natural Language Processing
eng
Ericson, Petter
Umeå universitet, Institutionen för datavetenskap
Umeå : Umeå Universitet
2017
graph grammars; formal languages; natural language processing; parameterized complexity; abstract meaning representation; tree automata; deterministic tree-walking transducers; mildly context-sensitive languages; hyperedge replacement; tree adjoining languages; minimally adequate teacher; Computer Sciences; Datavetenskap (datalogi)
Licentiate thesis, comprehensive summary
info:eu-repo/semantics/masterThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014
urn:isbn:9789176017227
Report / UMINF, 0348-0542 ; 17.13
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic graph grammars
formal languages
natural language processing
parameterized complexity
abstract meaning representation
tree automata
deterministic tree-walking transducers
mildly context-sensitive languages
hyperedge replacement
tree adjoining languages
minimally adequate teacher
Computer Sciences
Datavetenskap (datalogi)
description The formalized and algorithmic study of human language within the field of Natural Language Processing (NLP) has motivated much theoretical work in the related field of formal languages, in particular in the subfields of grammar and automata theory. Motivated and informed by NLP, the papers in this thesis explore the connections between expressibility (the ability of a formal system to define complex sets of objects) and algorithmic complexity (the varying amount of effort required to analyse and utilise such systems). Our research studies formal systems that work not just on strings, but on more complex structures such as trees and graphs, in particular syntax trees and semantic graphs. The field of mildly context-sensitive languages concerns attempts to find a useful class of formal languages between the context-free and the context-sensitive languages. We study formalisms defining two candidates for this class: the tree-adjoining languages and the languages defined by linear context-free rewriting systems. For the former, we specifically investigate the tree languages, and define a subclass and a tree automaton with linear parsing complexity. For the latter, we use the framework of parameterized complexity theory to investigate the related parsing problems more deeply, as well as the connections between the various formalisms that define the class. The field of semantic modelling aims to model, formally and accurately, not only the syntax of natural language statements but also their meaning. In particular, recent work on semantic graphs motivates our study of graph grammars and graph parsing. To the best of our knowledge, the formalism presented in Paper III of this thesis is the first graph grammar for which the uniform parsing problem has polynomial complexity, even for input graphs of unbounded node degree.
author Ericson, Petter
title Complexity and expressiveness for formal structures in Natural Language Processing
publisher Umeå universitet, Institutionen för datavetenskap
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014
http://nbn-resolving.de/urn:isbn:9789176017227