tensor2tensor
Reproduce English to German Machine Translation
Description
I am unable to reproduce the English-to-German machine translation results with either the Transformer or the Universal Transformer. Both models reach a BLEU score of only around 20 on newstest2013.
Reproducing Code
PROBLEM=translate_ende_wmt32k
MODEL=transformer
HPARAMS=transformer_base_single_gpu
DATA_DIR=$HOME/tensor2tensor/t2t_data
TMP_DIR=/miniscratch/mittalsa/tmp/t2t_datagen
TRAIN_DIR=$HOME/tensor2tensor/t2t_train/$PROBLEM/$MODEL-$HPARAMS
train_steps=500000 # Total number of train steps for all Epochs
eval_steps=1000 # Number of steps to perform for each evaluation
save_checkpoints_steps=1000
schedule="continuous_train_and_eval"
t2t-trainer \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--schedule=$schedule \
--output_dir=$TRAIN_DIR \
--train_steps=$train_steps \
--worker_gpu=1 \
--eval_steps=$eval_steps
For the Universal Transformer run, I set MODEL=universal_transformer and HPARAMS=universal_transformer_base. For evaluation, I ran:
PROBLEM=translate_ende_wmt32k
MODEL=transformer
HPARAMS=transformer_base
DATA_DIR=$HOME/tensor2tensor/t2t_data
TMP_DIR=/miniscratch/mittalsa/tmp/t2t_datagen/dev
TRAIN_DIR=$HOME/tensor2tensor/t2t_train/$PROBLEM/$MODEL-$HPARAMS
DECODE_FILE=$TMP_DIR/newstest2013.en
REF_FILE=$TMP_DIR/newstest2013.de
t2t-translate-all \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--model=$MODEL \
--model_dir=$TRAIN_DIR \
--output_dir=$TRAIN_DIR \
--hparams_set=$HPARAMS \
--decode_hparams="beam_size=4,alpha=0.6" \
--source=$DECODE_FILE \
--translations_dir=t2t_translations/$PROBLEM/$MODEL-$HPARAMS \
--t2t_usr_dir=$HOME
t2t-bleu \
--translations_dir=t2t_translations/$PROBLEM/$MODEL-$HPARAMS \
--reference=$REF_FILE \
--event_dir=events/$PROBLEM/$MODEL-$HPARAMS
Both the Transformer and the Universal Transformer score around 20-21 BLEU on newstest2013. What should I change to reproduce the published results for both models?
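One detail in the snippets above that may be worth ruling out: TRAIN_DIR is built from $HPARAMS, and the training snippet sets HPARAMS=transformer_base_single_gpu while the evaluation snippet sets HPARAMS=transformer_base. Unless the variables were overridden elsewhere, the two snippets resolve to different directories. The following is a minimal sketch that prints and compares the two paths; the `train_dir` helper and the `TRAINED_DIR`/`EVAL_DIR` names are hypothetical and only mirror how TRAIN_DIR is composed above.

```shell
# Hypothetical helper: compose the checkpoint directory the same way
# both snippets above do ($1 = problem, $2 = model, $3 = hparams set).
train_dir() {
  echo "$HOME/tensor2tensor/t2t_train/$1/$2-$3"
}

# Values copied from the training and evaluation snippets above.
TRAINED_DIR=$(train_dir translate_ende_wmt32k transformer transformer_base_single_gpu)
EVAL_DIR=$(train_dir translate_ende_wmt32k transformer transformer_base)

echo "training writes to:    $TRAINED_DIR"
echo "evaluation reads from: $EVAL_DIR"

# Flag the mismatch so it is obvious before decoding starts.
if [ "$TRAINED_DIR" != "$EVAL_DIR" ]; then
  echo "WARNING: training and evaluation directories differ"
fi
```

If the warning prints, listing the checkpoints in each directory (for example with `ls`) would show which one the decoder is actually reading from.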