TLAT-NMT
Source code for the EMNLP 2020 long paper "Token-level Adaptive Training for Neural Machine Translation".
Related code
Implemented on top of Fairseq-py, an open-source sequence modeling toolkit released by Facebook whose Transformer implementation closely follows Vaswani et al. (2017).
Requirements
This system has been tested in the following environment.
- OS: Ubuntu 16.04.1 LTS, 64-bit
- Python version >= 3.7
- PyTorch version >= 1.0
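A minimal environment setup could look like the sketch below. It assumes the modified Fairseq-py code sits at the repository root with a setup.py (package sources and CUDA builds are up to you):

# run inside a Python >= 3.7 environment, from the repository root
$ pip install torch            # PyTorch >= 1.0; pick the build matching your CUDA version
$ pip install --editable .     # install the bundled Fairseq-py code in editable mode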
Replicate the En-De results
Download the preprocessed WMT'16 En-De data provided by Google and preprocess it following the instructions.
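As a rough sketch of the binarization step, following fairseq's public WMT'16 En-De recipe (the directory names, file names, and dictionary sizes are assumptions; keep the --destdir consistent with the data directory used by the training and inference commands below):

$ TEXT=wmt16_en_de                                 # extracted Google-preprocessed text files
$ python preprocess.py --source-lang en --target-lang de \
  --trainpref $TEXT/train.tok.clean.bpe.32000 \
  --validpref $TEXT/newstest2013.tok.bpe.32000 \
  --testpref $TEXT/newstest2014.tok.bpe.32000 \
  --destdir wmt16_en_de_bpe32k \
  --nwordssrc 32768 --nwordstgt 32768 \
  --joined-dictionary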
Pretrain the model for about 30 epochs.
bash train.ende.sh
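The script presumably wraps fairseq's train.py; as a generic sketch of a Transformer-big run on this data (these hyperparameters are common defaults, not necessarily the authors' exact settings; see train.ende.sh for the actual command):

$ python train.py wmt16_en_de_bpe32k \
  --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
  --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
  --lr-scheduler inverse_sqrt --warmup-updates 4000 --lr 0.0007 \
  --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
  --dropout 0.3 --max-tokens 3584 \
  --save-dir checkpoints/ende-pretrain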
Continue training the model from the last checkpoint with the adaptive token-level weights for about 15 more epochs.
bash train.ende.ft.sh
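This stage presumably restores the pretrained checkpoint and switches to the weighted training objective; a sketch of the fairseq side of that is below. The criterion name adaptive_weight_cross_entropy is a hypothetical placeholder, not the name registered in this repository:

# "adaptive_weight_cross_entropy" is a hypothetical placeholder; use the criterion
# actually registered in this repository (see train.ende.ft.sh for the exact flags).
$ python train.py wmt16_en_de_bpe32k \
  --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
  --restore-file checkpoints/ende-pretrain/checkpoint_last.pt \
  --criterion adaptive_weight_cross_entropy \
  --save-dir checkpoints/ende-adaptive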
Inference
$ python generate.py wmt16_en_de_bpe32k --path $SMODEL \
  --gen-subset test --beam 4 --batch-size 128 \
  --remove-bpe --lenpen 0.6 > gen.out
# because fairseq's output is unordered, we need to recover the original order
$ grep ^H gen.out | cut -f1,3- | cut -c3- | sort -k1n | cut -f2- > pred.de
Citation
@inproceedings{gu2020token,
title={Token-level Adaptive Training for Neural Machine Translation},
author={Gu, Shuhao and Zhang, Jinchao and Meng, Fandong and Feng, Yang and Xie, Wanying and Zhou, Jie and Yu, Dong},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year={2020}
}