Syllabification and parameter optimisation in Zulu to English machine translation

Bibliographic Details
Main Authors: Gideon Kotzé, Friedel Wolff
Format: Article
Language: English
Published: South African Institute of Computer Scientists and Information Technologists 2015-12-01
Series: South African Computer Journal
Online Access: http://sacj.cs.uct.ac.za/index.php/sacj/article/view/323
Description
Summary: We present a series of experiments involving the machine translation of Zulu to English using a well-known statistical software system. Due to morphological complexity and relative scarcity of resources, the case of Zulu is challenging. Against a selection of baseline models, we show that a relatively naive approach of dividing Zulu words into syllables leads to a surprising improvement. We further improve on this model through manual configuration changes. Our best model significantly outperforms the baseline models (BLEU measure, at p < 0.001) even when they are optimised to a similar degree, only falling short of the well-known Morfessor morphological analyser that makes use of relatively sophisticated algorithms. These experiments suggest that even a simple optimisation procedure can improve the quality of this approach to a significant degree. This is promising particularly because it improves on a mostly language independent approach, at least within the same language family. Our work also drives the point home that sub-lexical alignment for Zulu is crucial for improved translation quality.
ISSN: 1015-7999, 2313-7835