Syllabification and parameter optimisation in Zulu to English machine translation
We present a series of experiments involving the machine translation of Zulu to English using a well-known statistical software system. Due to morphological complexity and relative scarcity of resources, the case of Zulu is challenging. Against a selection of baseline models, we show that a relative...
Main Authors: | Gideon Kotzé, Friedel Wolff |
---|---|
Format: | Article |
Language: | English |
Published: | South African Institute of Computer Scientists and Information Technologists, 2015-12-01 |
Series: | South African Computer Journal |
Subjects: | machine translation, word segmentation, alignment, Zulu, English |
Online Access: | http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323 |
id |
doaj-91a33596dd9140b89d4aef06f664b0be |
---|---|
record_format |
Article |
spelling |
doaj-91a33596dd9140b89d4aef06f664b0be; 2020-11-25T00:30:22Z; eng; South African Institute of Computer Scientists and Information Technologists; South African Computer Journal; 1015-7999; 2313-7835; 2015-12-01; 0; 57; 10.18489/sacj.v0i57.323; 143; Syllabification and parameter optimisation in Zulu to English machine translation; Gideon Kotzé 0; Friedel Wolff 1; School of Computing University of South Africa; School of Computing University of South Africa; http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323; machine translation; word segmentation; alignment; Zulu; English |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Gideon Kotzé, Friedel Wolff |
title |
Syllabification and parameter optimisation in Zulu to English machine translation |
publisher |
South African Institute of Computer Scientists and Information Technologists |
series |
South African Computer Journal |
issn |
1015-7999 2313-7835 |
publishDate |
2015-12-01 |
description |
We present a series of experiments involving the machine translation of Zulu to English using a well-known statistical software system. Due to morphological complexity and relative scarcity of resources, the case of Zulu is challenging. Against a selection of baseline models, we show that a relatively naive approach of dividing Zulu words into syllables leads to a surprising improvement. We further improve on this model through manual configuration changes. Our best model significantly outperforms the baseline models (BLEU measure, at p < 0.001) even when they are optimised to a similar degree, only falling short of the well-known Morfessor morphological analyser that makes use of relatively sophisticated algorithms. These experiments suggest that even a simple optimisation procedure can improve the quality of this approach to a significant degree. This is promising particularly because it improves on a mostly language independent approach — at least within the same language family. Our work also drives the point home that sub-lexical alignment for Zulu is crucial for improved translation quality. |
topic |
machine translation, word segmentation, alignment, Zulu, English |