Využití hrubé reprezentace slov ve strojovém překladu do češtiny (Using Coarse Word Representation in Machine Translation into Czech)

Bibliographic Details
Main Author: Tlustý, Marek
Other Authors: Bojar, Ondřej
Format: Dissertation
Language: Czech
Published: 2016
Online Access: http://www.nusl.cz/ntk/nusl-352567
Description
Summary: This thesis explores the possibilities of coarse word representation in machine translation from German and Hungarian into Czech. First, we compare different tools for splitting German and Hungarian compounds. For Hungarian, we additionally designed several variants of noun splitting. We then experiment with word classes, combining word splitting with several different word-class configurations; in particular, we use bilingual classes. After that, a comparison is made for translation from German or Hungarian into Czech. Outputs are evaluated with the automatic metrics BLEU and METEOR, and the best configurations are then evaluated manually. It turns out that splitting German compounds and Hungarian nouns on its own does not lead to much better results when translating into Czech. In combination with word classes, however, there is a noticeable improvement.