Syllabification and parameter optimisation in Zulu to English machine translation

We present a series of experiments involving the machine translation of Zulu to English using a well-known statistical software system. Due to morphological complexity and relative scarcity of resources, the case of Zulu is challenging. Against a selection of baseline models, we show that a relative...

Bibliographic Details
Main Authors: Gideon Kotzé, Friedel Wolff
Format: Article
Language: English
Published: South African Institute of Computer Scientists and Information Technologists 2015-12-01
Series: South African Computer Journal
Subjects: machine translation, word segmentation, alignment, Zulu, English
Online Access: http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323
id doaj-91a33596dd9140b89d4aef06f664b0be
record_format Article
spelling doaj-91a33596dd9140b89d4aef06f664b0be
2020-11-25T00:30:22Z
eng
South African Institute of Computer Scientists and Information Technologists
South African Computer Journal
1015-7999
2313-7835
2015-12-01
0
57
10.18489/sacj.v0i57.323
143
Syllabification and parameter optimisation in Zulu to English machine translation
Gideon Kotzé (School of Computing, University of South Africa)
Friedel Wolff (School of Computing, University of South Africa)
http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323
machine translation
word segmentation
alignment
Zulu
English
collection DOAJ
language English
format Article
sources DOAJ
author Gideon Kotzé
Friedel Wolff
title Syllabification and parameter optimisation in Zulu to English machine translation
publisher South African Institute of Computer Scientists and Information Technologists
series South African Computer Journal
issn 1015-7999
2313-7835
publishDate 2015-12-01
description We present a series of experiments involving the machine translation of Zulu to English using a well-known statistical software system. Due to morphological complexity and relative scarcity of resources, the case of Zulu is challenging. Against a selection of baseline models, we show that a relatively naive approach of dividing Zulu words into syllables leads to a surprising improvement. We further improve on this model through manual configuration changes. Our best model significantly outperforms the baseline models (BLEU measure, at p < 0.001) even when they are optimised to a similar degree, only falling short of the well-known Morfessor morphological analyser that makes use of relatively sophisticated algorithms. These experiments suggest that even a simple optimisation procedure can improve the quality of this approach to a significant degree. This is promising particularly because it improves on a mostly language-independent approach, at least within the same language family. Our work also drives the point home that sub-lexical alignment for Zulu is crucial for improved translation quality.
topic machine translation
word segmentation
alignment
Zulu
English
url http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323