Augmenting Neural Machine Translation through Round-Trip Training Approach


Bibliographic Details
Main Authors: Ahmadnia Benyamin, Dorr Bonnie J.
Format: Article
Language: English
Published: De Gruyter 2019-10-01
Series: Open Computer Science
Subjects:
Online Access: http://www.degruyter.com/view/j/comp.2019.9.issue-1/comp-2019-0019/comp-2019-0019.xml?format=INT
Description
Summary: The quality of Neural Machine Translation (NMT), as a data-driven approach, depends heavily on the quantity, quality, and relevance of the training dataset. Such approaches have achieved promising results in bilingually high-resource scenarios but are inadequate under low-resource conditions. NMT systems typically learn from millions of words of bilingual training data, yet the human labeling process is costly and time-consuming. In this paper, we describe a round-trip training approach to bilingual low-resource NMT that takes advantage of monolingual datasets to address the training-data bottleneck and thereby improve translation quality. We conduct detailed experiments on English-Spanish as a high-resource language pair as well as Persian-Spanish as a low-resource language pair. Experimental results show that this competitive approach outperforms the baseline systems and improves translation quality.
ISSN: 2299-1093