Replacing Linguists with Dummies: A Serious Need for Trivial Baselines in Multi-Task Neural Machine Translation

Recent developments in machine translation experiment with the idea that a model can improve translation quality by performing multiple tasks, e.g., translating from source to target and also labeling each source word with syntactic information. The intuition is that the network would generalize knowledge over the multiple tasks, improving translation performance, especially in low-resource conditions. We devised an experiment that casts doubt on this intuition. We perform similar experiments in both multi-decoder and interleaving setups that label each target word either with a syntactic tag or a completely random tag. Surprisingly, we show that the model performs nearly as well on uncorrelated random tags as on true syntactic tags. We hint at some possible explanations for this behavior.
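The "interleaving" setup mentioned in the abstract can be illustrated with a minimal sketch. The Python snippet below is not the authors' code; the tag inventory, tokenization, and function names are illustrative assumptions. It shows how a target sentence might be augmented with either true syntactic tags or uniformly random tags from the same inventory, yielding training targets such as "DET the NOUN cat ...", which is the kind of random-tag baseline the paper compares against.

    import random

    # Assumed, simplified tag inventory; the paper uses richer syntactic tag sets.
    TAG_SET = ["NOUN", "VERB", "DET", "ADJ", "ADP", "PRON", "PUNCT"]

    def interleave(tokens, tags):
        """Interleave one tag before each target token: t1 w1 t2 w2 ..."""
        assert len(tokens) == len(tags)
        out = []
        for tag, tok in zip(tags, tokens):
            out.extend([tag, tok])
        return out

    def random_tags(tokens, tag_set=TAG_SET, seed=None):
        """Baseline: sample each token's tag uniformly at random,
        i.e. uncorrelated with any linguistic property of the token."""
        rng = random.Random(seed)
        return [rng.choice(tag_set) for _ in tokens]

    # The same sentence with true vs. random tags.
    tokens = ["the", "cat", "sat", "on", "the", "mat", "."]
    true_tags = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"]

    print(" ".join(interleave(tokens, true_tags)))
    print(" ".join(interleave(tokens, random_tags(tokens, seed=0))))

Under this construction, the only difference between the linguistically informed system and the trivial baseline is the tag sequence itself, which is what lets the paper attribute any quality gap to the tags' information content.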


Bibliographic Details
Main Authors: Kondratyuk, Daniel; Cardenas, Ronald; Bojar, Ondřej (all: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics)
Format: Article
Language: English
Published: Sciendo, 2019-10-01
Series: Prague Bulletin of Mathematical Linguistics, No. 113, pp. 31-40
ISSN: 1804-0462
Online Access: https://doi.org/10.2478/pralin-2019-0005