Neřízená závistlostní analýza (Unsupervised Dependency Parsing)
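Unsupervised dependency parsing is an alternative approach to identifying relations between words in a sentence. It does not require an annotated treebank, is independent of any particular linguistic theory, and is universal across languages. Its main disadvantage is that parsing quality has so far been fairly low. This thesis reviews previous work and introduces a novel approach to unsupervised parsing. Our dependency model consists of four submodels: (i) an edge model, which controls the distribution of governor-dependent pairs, (ii) a fertility model, which controls the number of a node's dependents, (iii) a distance model, which controls the length of the dependency edges, and (iv) a reducibility model. The reducibility model is based on the hypothesis that words that can be removed from a sentence without violating its grammaticality are leaves in the dependency tree. The dependency structures are induced using Gibbs sampling. We introduce a sampling algorithm that keeps the dependency trees projective, which is a very valuable constraint. In our experiments across 30 languages, we discuss the results of various settings of our models. Our method outperforms previously reported results on a majority of the test languages.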
Main Author: | Mareček, David |
---|---|
Other Authors: | Žabokrtský, Zdeněk |
Format: | Doctoral Thesis |
Language: | English |
Published: | 2012 |
Online Access: | http://www.nusl.cz/ntk/nusl-306310 |
id | ndltd-nusl.cz-oai-invenio.nusl.cz-306310 |
---|---|
record_format | oai_dc |
spelling | ndltd-nusl.cz-oai-invenio.nusl.cz-306310 2021-03-29T05:12:13Z Neřízená závistlostní analýza Unsupervised Dependency Parsing Mareček, David Žabokrtský, Zdeněk Jurčíček, Filip Sogaard, Anders Unsupervised dependency parsing is an alternative approach to identifying relations between words in a sentence. It does not require any annotated treebank, it is independent of language theory and universal across languages. However, its main disadvantage is its so far quite low parsing quality. This thesis discusses some previous works and introduces a novel approach to unsupervised parsing. Our dependency model consists of four submodels: (i) edge model, which controls the distribution of governor-dependent pairs, (ii) fertility model, which controls the number of node's dependents, (iii) distance model, which controls the length of the dependency edges, and (iv) reducibility model. The reducibility model is based on a hypothesis that words that can be removed from a sentence without violating its grammaticality are leaves in the dependency tree. Induction of the dependency structures is done using Gibbs sampling method. We introduce a sampling algorithm that keeps the dependency trees projective, which is a very valuable constraint. In our experiments across 30 languages, we discuss the results of various settings of our models. Our method outperforms the previously reported results on a majority of the test languages. 2012 info:eu-repo/semantics/doctoralThesis http://www.nusl.cz/ntk/nusl-306310 eng info:eu-repo/semantics/restrictedAccess |
collection | NDLTD |
language | English |
format | Doctoral Thesis |
sources | NDLTD |
description | Unsupervised dependency parsing is an alternative approach to identifying relations between words in a sentence. It does not require an annotated treebank, is independent of any particular linguistic theory, and is universal across languages. Its main disadvantage is that parsing quality has so far been fairly low. This thesis reviews previous work and introduces a novel approach to unsupervised parsing. Our dependency model consists of four submodels: (i) an edge model, which controls the distribution of governor-dependent pairs, (ii) a fertility model, which controls the number of a node's dependents, (iii) a distance model, which controls the length of the dependency edges, and (iv) a reducibility model. The reducibility model is based on the hypothesis that words that can be removed from a sentence without violating its grammaticality are leaves in the dependency tree. The dependency structures are induced using Gibbs sampling. We introduce a sampling algorithm that keeps the dependency trees projective, which is a very valuable constraint. In our experiments across 30 languages, we discuss the results of various settings of our models. Our method outperforms previously reported results on a majority of the test languages. |
author2 | Žabokrtský, Zdeněk |
author_facet | Žabokrtský, Zdeněk Mareček, David |
author | Mareček, David |
spellingShingle | Mareček, David Neřízená závistlostní analýza |
author_sort | Mareček, David |
title | Neřízená závistlostní analýza |
title_short | Neřízená závistlostní analýza |
title_full | Neřízená závistlostní analýza |
title_fullStr | Neřízená závistlostní analýza |
title_full_unstemmed | Neřízená závistlostní analýza |
title_sort | neřízená závistlostní analýza |
publishDate | 2012 |
url | http://www.nusl.cz/ntk/nusl-306310 |
work_keys_str_mv | AT marecekdavid nerizenazavistlostnianalyza AT marecekdavid unsuperviseddependencyparsing |
_version_ | 1719389274037551104 |
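
The description in this record says that the sampling algorithm keeps the dependency trees projective. As a small illustration of what that constraint means, the Python sketch below checks whether a given tree is projective, that is, whether any two dependency edges cross. It is not the thesis's sampling algorithm. The function name `is_projective` and the `heads` encoding (a list of 1-based governor positions, with 0 for an artificial root placed before the sentence) are assumptions made for this example, and the input is assumed to already be a valid tree.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if no two dependency edges cross.

    heads[i] is the 1-based position of the governor of word i + 1;
    0 stands for an artificial root placed before the first word.
    (Hypothetical encoding chosen for this sketch.)
    """
    # Store each edge as its (left, right) endpoints in word positions.
    edges = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a_left, a_right in edges:
        for b_left, b_right in edges:
            # Edges cross when b starts strictly inside a and ends strictly outside it.
            if a_left < b_left < a_right < b_right:
                return False
    return True


# Word 2 is the root; words 1 and 3 attach to it. No edges cross.
print(is_projective([2, 0, 2]))
# Edge 3 -> 1 crosses edge 4 -> 2, so the tree is non-projective.
print(is_projective([3, 4, 0, 3]))
```

The check compares every pair of edges, which is quadratic in sentence length. That is sufficient for an illustration on single sentences.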