T2T 1.15.7 with TensorFlow 2.2: t2t-trainer produces additional model weights if trained on more than 1 GPU
Description
When training with t2t 1.15.7 on TensorFlow 2.2 using 1 GPU, the model weights are ~211M. As the number of GPUs increases, the size of the model weights grows with it: around 378M with 2 GPUs, and up to 1.4G with 8 GPUs.
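To narrow down where the extra size comes from, it may help to break down each checkpoint's on-disk size by file type. Below is a minimal sketch, assuming the usual TensorFlow checkpoint file naming (`model.ckpt-<step>.data-*`, `.index`, `.meta`). The `checkpoint_sizes` helper is hypothetical and not part of t2t; it only reads file sizes and does not load the checkpoint.

```python
import os
import re
from collections import defaultdict

# Assumed naming: model.ckpt-<step>.index, model.ckpt-<step>.meta,
# and model.ckpt-<step>.data-XXXXX-of-YYYYY (one file per shard).
CKPT_FILE = re.compile(r"^model\.ckpt-(\d+)\.(index|meta|data)(?:-\d{5}-of-\d{5})?$")


def checkpoint_sizes(train_dir):
    """Return {step: {kind: total_bytes}} for checkpoint files in train_dir.

    kind is one of "data" (variable values), "index", or "meta" (graph).
    Files that do not match the assumed naming are ignored.
    """
    sizes = defaultdict(lambda: defaultdict(int))
    for name in os.listdir(train_dir):
        match = CKPT_FILE.match(name)
        if match is None:
            continue
        step, kind = int(match.group(1)), match.group(2)
        sizes[step][kind] += os.path.getsize(os.path.join(train_dir, name))
    # Convert to plain dicts so the result prints and compares cleanly.
    return {step: dict(kinds) for step, kinds in sizes.items()}
```

Calling `checkpoint_sizes` on the output directory of a 1-GPU run and of a multi-GPU run, then comparing the `"data"` and `"meta"` totals for the same step, would show whether the growth is in the saved variables or in the saved graph.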
...
Environment information
OS: Ubuntu 18.04.4 LTS
$ pip freeze | grep tensor
tensorboard==2.2.2
tensorboard-plugin-wit==1.7.0
tensorflow-addons==0.11.2
tensorflow-datasets==2.1.0
tensorflow-estimator==2.2.0
tensorflow-gan==2.0.0
tensorflow-gpu==2.2.0
tensorflow-hub==0.9.0
tensorflow-metadata==0.23.0
tensorflow-probability==0.7.0
$ python -V
Python 3.6.10 :: Anaconda, Inc
For bugs: reproduction and error logs
# Steps to reproduce:
PROBLEM=translate_ende_wmt32k
MODEL=transformer
HPARAMS=transformer_big
DATA_DIR=$PWD/t2t_data
TMP_DIR=$PWD/t2t_datagen
TRAIN_DIR=$PWD/t2t_train/$PROBLEM/$MODEL-$HPARAMS
BEAM_SIZE=4
ALPHA=0.6
export PYTHONPATH=${PWD}:$PYTHONPATH
python3 t2t-trainer --data_dir=$DATA_DIR --problem=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --output_dir=$TRAIN_DIR/bs3300 --hparams='batch_size=3300' --worker_gpu=8 \
  --keep_checkpoint_max=20 --local_eval_frequency=1000 --train_steps=1000000 --eval_throttle_seconds=3600
# Error logs:
...