Developing a minimalist parser for free word order languages
Main Author: | |
---|---|
Format: | Others |
Language: | en |
Published: | University of Ottawa (Canada), 2013 |
Subjects: | |
Online Access: | http://hdl.handle.net/10393/27031 http://dx.doi.org/10.20381/ruor-11883 |
Summary: | We propose a parser for free word order languages, based on ideas from the Minimalist Program. The parser simulates aspects of a human listener who necessarily begins sentence analysis before all the words have become available. We first sketch the problems that free word order languages pose. One such problem is discontinuous noun phrase constituency. Languages like Latin permit verbs, adjectives and so on to split noun phrases. We assume that the human parser assembles syntactic structures in the process of understanding a sentence; what happens to noun phrase fragments that arrive later in the derivation? Those that arrive earlier enter the existing syntactic structures, so they become less accessible. What mechanism best incorporates later fragments without undoing structures already built?
We show how difficult it is to make existing frameworks for minimalist parsing work for free word order languages and simulate realistic syntactic conditions. We briefly describe a formalism and a parsing algorithm that elegantly overcome these difficulties, and we illustrate them with detailed Latin examples. Previous formalisms for both minimalist generation and parsing tended to use cancellation of features as the primary mechanism for checking whether syntactic structures are compatible for merging into larger units. This is how words and phrases are marked as compatible and added to a larger structure. Instead, our formalism uses feature sets and unification-based operations in order to allow larger structures to acquire features from the smaller structures within them. They can then expose these features to discontinuous elements that arrive later in the derivation. In addition to the examples we provide for Latin, we provide English examples to demonstrate that this parsing algorithm can also be used with languages that require a more fixed order. After that, we discuss an implementation of this parsing algorithm written in Prolog.
We then discuss an extension to this formalism that allows it to handle pro-drop languages, and we show how this extension can be elegantly generalized to broaden the range of linguistic phenomena the parser can handle beyond pro-drop. Finally, we present a corpus study that justifies some of the limitations of this parser. |
---|
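
The contrast the summary draws between feature cancellation and unification over feature sets can be illustrated with a minimal sketch. This is not the thesis's Prolog implementation. The names `unify`, `Node`, `merge` and `attach_late`, the feature labels (`case`, `num`, `gen`) and the `exposed` table are all assumptions made for illustration. The idea shown is only that a larger structure keeps the features of the phrases inside it visible, so that a noun phrase fragment arriving later can still be checked against them and attached without rebuilding anything.

```python
from typing import Dict, List, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Return the union of two feature sets, or None if a shared attribute clashes."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None  # e.g. case=acc against case=nom
        merged[key] = value
    return merged


class Node:
    """A syntactic object with its own features and a table of exposed features."""

    def __init__(self, label: str, features: Features,
                 children: Optional[List["Node"]] = None) -> None:
        self.label = label
        self.features = dict(features)
        self.children = list(children or [])
        # Hypothetical: features of embedded phrases, keyed by role, that stay
        # visible to fragments arriving later.
        self.exposed: Dict[str, Features] = {}


def merge(head: Node, dependent: Node, role: str,
          required: Features) -> Optional[Node]:
    """Combine head and dependent if the dependent unifies with what the head requires."""
    checked = unify(required, dependent.features)
    if checked is None:
        return None
    parent = Node(head.label, head.features, [head, dependent])
    parent.exposed = dict(head.exposed)
    # Unlike cancellation, the checked features are kept, not consumed.
    parent.exposed[role] = checked
    return parent


def attach_late(structure: Node, fragment: Node) -> Optional[str]:
    """Attach a late fragment to the first role whose exposed features it unifies with."""
    for role, feats in structure.exposed.items():
        result = unify(feats, fragment.features)
        if result is not None:
            structure.exposed[role] = result
            structure.children.append(fragment)  # existing structure is extended, not undone
            return role
    return None


# Latin illustration: "puellam" (girl, accusative) merges with "videt" (sees);
# the adjective "bonam" arrives afterwards and agrees with the exposed object.
verb = Node("videt", {"cat": "V"})
noun = Node("puellam", {"case": "acc", "num": "sg", "gen": "fem"})
vp = merge(verb, noun, "object", {"case": "acc"})
late_adjective = Node("bonam", {"case": "acc", "num": "sg", "gen": "fem"})
attached_role = attach_late(vp, late_adjective)
```

In this sketch a nominative masculine form such as "bonus" would fail to unify with the exposed object features and would be left unattached, which is the kind of compatibility check the summary attributes to feature sets rather than to cancellation.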