Complexity and expressiveness for formal structures in Natural Language Processing
The formalized and algorithmic study of human language within the field of Natural Language Processing (NLP) has motivated much theoretical work in the related field of formal languages, in particular the subfields of grammar and automata theory. Motivated and informed by NLP, the papers in this the...
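Main Author: | Ericson, Petter |
---|---|
Format: | Others |
Language: | English |
Published: | Umeå universitet, Institutionen för datavetenskap, 2017 |
Subjects: | graph grammars; formal languages; natural language processing; parameterized complexity; abstract meaning representation; tree automata; deterministic tree-walking transducers; mildly context-sensitive languages; hyperedge replacement; tree adjoining languages; minimally adequate teacher; Computer Sciences; Datavetenskap (datalogi) |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014 http://nbn-resolving.de/urn:isbn:9789176017227 |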
id |
ndltd-UPSALLA1-oai-DiVA.org-umu-135014 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-umu-135014; 2018-06-10T05:17:53Z; Complexity and expressiveness for formal structures in Natural Language Processing; eng; Ericson, Petter; Umeå universitet, Institutionen för datavetenskap; Umeå : Umeå Universitet; 2017; Licentiate thesis, comprehensive summary; info:eu-repo/semantics/masterThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014; urn:isbn:9789176017227; Report / UMINF, 0348-0542 ; 17.13; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
graph grammars; formal languages; natural language processing; parameterized complexity; abstract meaning representation; tree automata; deterministic tree-walking transducers; mildly context-sensitive languages; hyperedge replacement; tree adjoining languages; minimally adequate teacher; Computer Sciences; Datavetenskap (datalogi) |
description |
The formalized and algorithmic study of human language within the field of Natural Language Processing (NLP) has motivated much theoretical work in the related field of formal languages, in particular the subfields of grammar and automata theory. Motivated and informed by NLP, the papers in this thesis explore the connections between expressibility (the ability of a formal system to define complex sets of objects) and algorithmic complexity (the varying amount of effort required to analyse and utilise such systems). Our research studies formal systems working not just on strings, but on more complex structures such as trees and graphs, in particular syntax trees and semantic graphs. The field of mildly context-sensitive languages concerns attempts to find a useful class of formal languages between the context-free and the context-sensitive languages. We study formalisms defining two candidates for this class: tree-adjoining languages and the languages defined by linear context-free rewriting systems. For the former, we specifically investigate the tree languages, and define a subclass and a tree automaton with linear parsing complexity. For the latter, we use the framework of parameterized complexity theory to investigate more deeply the related parsing problems, as well as the connections between the various formalisms defining the class. The field of semantic modelling aims to model formally and accurately not only the syntax of natural language statements, but also their meaning. In particular, recent work on semantic graphs motivates our study of graph grammars and graph parsing. To the best of our knowledge, the formalism presented in Paper III of this thesis is the first graph grammar for which the uniform parsing problem has polynomial complexity, even for input graphs of unbounded node degree. |
author |
Ericson, Petter |
title |
Complexity and expressiveness for formal structures in Natural Language Processing |
publisher |
Umeå universitet, Institutionen för datavetenskap |
publishDate |
2017 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-135014 http://nbn-resolving.de/urn:isbn:9789176017227 |