Replacing Linguists with Dummies: A Serious Need for Trivial Baselines in Multi-Task Neural Machine Translation
Recent developments in machine translation experiment with the idea that a model can improve translation quality by performing multiple tasks, e.g., translating from source to target and also labeling each source word with syntactic information. The intuition is that the network would generalize knowledge over the multiple tasks, improving translation performance, especially in low-resource conditions. We devised an experiment that casts doubt on this intuition. We perform similar experiments in both multi-decoder and interleaving setups that label each target word either with a syntactic tag or a completely random tag. Surprisingly, we show that the model performs nearly as well on uncorrelated random tags as on true syntactic tags. We hint at some possible explanations of this behavior.
Main Authors: | Kondratyuk Daniel, Cardenas Ronald, Bojar Ondřej |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2019-10-01 |
Series: | Prague Bulletin of Mathematical Linguistics |
ISSN: | 1804-0462 |
Online Access: | https://doi.org/10.2478/pralin-2019-0005 |
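The abstract contrasts true syntactic tags with uncorrelated random tags in an interleaving setup, where each target word is paired with a tag in the output sequence. The following is a minimal sketch of what such target-side preprocessing could look like; the function names, the tag vocabulary size, and the tag-before-token ordering are illustrative assumptions, not the authors' actual implementation.

```python
import random

# Assumed random-tag vocabulary; the paper's actual tag set and size may differ.
RANDOM_TAG_VOCAB = [f"TAG{i}" for i in range(20)]


def interleave_targets(tokens, tags):
    """Build an interleaved target sequence: tag_1 tok_1 tag_2 tok_2 ...
    so the decoder must predict a tag alongside each word."""
    assert len(tokens) == len(tags)
    out = []
    for tok, tag in zip(tokens, tags):
        out.extend([tag, tok])
    return out


def random_tags(tokens, rng=random):
    """Assign each target token a uniformly random tag, so the tags
    carry no linguistic information (the trivial baseline)."""
    return [rng.choice(RANDOM_TAG_VOCAB) for _ in tokens]


# Example: syntactic tags vs. the trivial random baseline.
tokens = ["the", "cat", "sleeps"]
syntactic = ["DET", "NOUN", "VERB"]  # e.g., POS tags from a parser
print(interleave_targets(tokens, syntactic))
print(interleave_targets(tokens, random_tags(tokens)))
```

Under this sketch, the paper's finding is that training on the second kind of sequence (random tags) yields translation quality nearly matching the first (syntactic tags), suggesting the gains attributed to linguistic supervision need trivial baselines like this one.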