TLAT-NMT
Source code for the EMNLP 2020 long paper "Token-level Adaptive Training for Neural Machine Translation".
Related code
Implemented on top of Fairseq-py, an open-source sequence modeling toolkit released by Facebook whose Transformer implementation closely follows Vaswani et al. (2017).
Requirements
This system has been tested in the following environment.
- OS: Ubuntu 16.04.1 LTS, 64-bit
- Python version >= 3.7
- PyTorch version >= 1.0
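A minimal environment setup could look like the sketch below. It assumes the modified Fairseq-py code sits at the repository root with a setup.py (package sources and CUDA builds are up to you):

# run inside a Python >= 3.7 environment, from the repository root
$ pip install torch            # PyTorch >= 1.0; pick the build matching your CUDA version
$ pip install --editable .     # install the bundled Fairseq-py code in editable mode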
Replicate the En-De results
Download the preprocessed WMT'16 En-De data provided by Google and preprocess it following the instructions.
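As a rough sketch of the binarization step, following fairseq's public WMT'16 En-De recipe (the directory names, file names, and dictionary sizes are assumptions; keep the --destdir consistent with the data directory used by the training and inference commands below):

$ TEXT=wmt16_en_de                                 # extracted Google-preprocessed text files
$ python preprocess.py --source-lang en --target-lang de \
  --trainpref $TEXT/train.tok.clean.bpe.32000 \
  --validpref $TEXT/newstest2013.tok.bpe.32000 \
  --testpref $TEXT/newstest2014.tok.bpe.32000 \
  --destdir wmt16_en_de_bpe32k \
  --nwordssrc 32768 --nwordstgt 32768 \
  --joined-dictionary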
Pretrain the model for about 30 epochs.
bash train.ende.sh
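The script presumably wraps fairseq's train.py; as a generic sketch of a Transformer-big run on this data (these hyperparameters are common defaults, not necessarily the authors' exact settings; see train.ende.sh for the actual command):

$ python train.py wmt16_en_de_bpe32k \
  --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
  --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
  --lr-scheduler inverse_sqrt --warmup-updates 4000 --lr 0.0007 \
  --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
  --dropout 0.3 --max-tokens 3584 \
  --save-dir checkpoints/ende-pretrain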
Continue training the model from the last checkpoint with the adaptive token-level weights for about 15 more epochs.
bash train.ende.ft.sh
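This stage presumably restores the pretrained checkpoint and switches to the weighted training objective; a sketch of the fairseq side of that is below. The criterion name adaptive_weight_cross_entropy is a hypothetical placeholder, not the name registered in this repository:

# "adaptive_weight_cross_entropy" is a hypothetical placeholder; use the criterion
# actually registered in this repository (see train.ende.ft.sh for the exact flags).
$ python train.py wmt16_en_de_bpe32k \
  --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
  --restore-file checkpoints/ende-pretrain/checkpoint_last.pt \
  --criterion adaptive_weight_cross_entropy \
  --save-dir checkpoints/ende-adaptive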
Inference
$ python generate.py wmt16_en_de_bpe32k --path $SMODEL \
  --gen-subset test --beam 4 --batch-size 128 \
  --remove-bpe --lenpen 0.6 > gen.out
# because fairseq's output is unordered, we need to recover the original order
$ grep ^H gen.out | cut -f1,3- | cut -c3- | sort -k1n | cut -f2- > pred.de
Citation
@inproceedings{gu2020token,
title={Token-level Adaptive Training for Neural Machine Translation},
author={Gu, Shuhao and Zhang, Jinchao and Meng, Fandong and Feng, Yang and Xie, Wanying and Zhou, Jie and Yu, Dong},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year={2020}
}