Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic
In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accu...
Main Author: | Östling, Robert |
---|---|
Format: | Others |
Language: | English |
Published: | Stockholms universitet, Avdelningen för datorlingvistik, 2013 |
Subjects: | part of speech tagging; pos tagging; icelandic |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-90304 |
id |
ndltd-UPSALLA1-oai-DiVA.org-su-90304 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-su-90304; 2013-06-06T04:27:57Z; Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic; eng; Östling, Robert; Stockholms universitet, Avdelningen för datorlingvistik; Linköping University Electronic Press, Linköpings universitet; 2013; part of speech tagging; pos tagging; icelandic; In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accuracy of 92.82%. Furthermore, by adding data from a morphological database, and word embeddings induced from an unannotated corpus, the accuracy increases to 93.84%. This is equivalent to an error reduction of 5.5%, compared to the previously best tagger for Icelandic, consisting of linguistic rules and a Hidden Markov Model.; Conference paper; info:eu-repo/semantics/conferenceObject; text; http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-90304; Linköping Electronic Conference Proceedings, 1650-3740; Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013), p. 105-119; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
part of speech tagging; pos tagging; icelandic |
description |
In this paper, we experiment with using Stagger, an open-source implementation of an Averaged Perceptron tagger, to tag Icelandic, a morphologically complex language. By adding language-specific linguistic features and using IceMorphy, an unknown word guesser, we obtain state-of-the-art tagging accuracy of 92.82%. Furthermore, by adding data from a morphological database, and word embeddings induced from an unannotated corpus, the accuracy increases to 93.84%. This is equivalent to an error reduction of 5.5%, compared to the previously best tagger for Icelandic, consisting of linguistic rules and a Hidden Markov Model. |
author |
Östling, Robert |
title |
Tagging a Morphologically Complex Language Using an Averaged Perceptron Tagger: The Case of Icelandic |
publisher |
Stockholms universitet, Avdelningen för datorlingvistik |
publishDate |
2013 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-90304 |
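Note on the reported figures: the 5.5% error reduction in the description is measured against the previously best tagger, not against the 92.82% configuration. The sketch below illustrates the arithmetic, assuming the usual definition of relative error reduction (the share of the baseline's errors that the new system removes). The function names are hypothetical, and the baseline accuracy it derives is only implied by the abstract's numbers, not stated in this record.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Share of the baseline's errors removed by the new system.

    Both accuracies are percentages in the range 0-100.
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc: float, reduction: float) -> float:
    """Baseline accuracy implied by a new accuracy and a relative error reduction.

    Assumption: `reduction` is a fraction (0.055 for 5.5%).
    """
    new_err = 100.0 - new_acc
    return 100.0 - new_err / (1.0 - reduction)


if __name__ == "__main__":
    # Gain from adding the morphological database and word embeddings.
    internal = relative_error_reduction(92.82, 93.84)
    print(f"92.82% -> 93.84%: {internal:.1%} of errors removed")

    # Accuracy of the earlier tagger implied by the reported 5.5% figure.
    baseline = implied_baseline_accuracy(93.84, 0.055)
    print(f"implied accuracy of the previous best tagger: {baseline:.2f}%")
```

Under these assumptions, going from 92.82% to 93.84% removes roughly 14% of the errors, while the 5.5% figure corresponds to a previous best tagger at about 93.5% accuracy.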