Training Tips for the Transformer Model

This article describes our experiments in neural machine translation using the recent Tensor2Tensor framework and the Transformer sequence-to-sequence model (Vaswani et al., 2017). We examine some of the critical parameters that affect the final translation quality, memory usage, training stability...

Bibliographic Details
Main Authors: Popel, Martin; Bojar, Ondřej
Format: Article
Language: English
Published: Sciendo 2018-04-01
Series: Prague Bulletin of Mathematical Linguistics
Online Access: https://doi.org/10.2478/pralin-2018-0002