DeepSpeedExamples
BingBertSQuAD fine-tuning results do not match the tutorial document
I reran the shell script run_squad_baseline.sh under BingBertSquad without any modification on 8 V100 GPUs. The pretrained model is https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin, but I did not get the expected results.
The document says that with the default config the result should be EM: 87.27, F1: 93.33, but I got {"exact_match": 8.136234626300851, "f1": 16.67697307405455}. Also, with PER_GPU_BATCH_SIZE=3 the training speed is 8.31 it/s, which means about 24.93 samples/second (8.31 it/s × batch size 3), whereas the document reports around 36.34.
Did I miss something?
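For reference, the throughput number above is just iterations per second multiplied by the per-GPU batch size. A tiny sketch of that arithmetic (variable names are mine, not from the script, and it assumes the reported it/s is per process):

```python
# Rough throughput check for the numbers quoted above.
# Assumes samples/s here means it/s * per-GPU batch size, as in the post.
iterations_per_second = 8.31
per_gpu_batch_size = 3

samples_per_second = iterations_per_second * per_gpu_batch_size
print(f"{samples_per_second:.2f} samples/s")  # ~24.93, vs ~36.34 in the tutorial
```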
I met the same problem: during fine-tuning the gradients overflow, so all iterations are skipped.
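For context on what "gradient overflow, skipping step" means, here is a minimal, generic sketch of dynamic loss scaling in fp16 training (plain PyTorch, not the tutorial's actual optimizer code, and the function name is mine): when the scaled gradients contain inf/NaN, the optimizer step is skipped and the loss scale is reduced, so if every iteration overflows the model never updates.

```python
import torch

# Illustrative sketch of dynamic loss scaling; the fp16 optimizer used by the
# tutorial implements the same idea internally.
def fp16_step(model, optimizer, loss, loss_scale):
    optimizer.zero_grad()
    (loss * loss_scale).backward()  # scale the loss to keep fp16 grads in range

    # Check the (scaled) gradients for inf/NaN.
    overflow = any(
        p.grad is not None and not torch.isfinite(p.grad).all()
        for p in model.parameters()
    )

    if overflow:
        # "Gradient overflow, skipping step": no parameter update, shrink the scale.
        return loss_scale / 2.0

    # Unscale the gradients, then apply the update as usual.
    for p in model.parameters():
        if p.grad is not None:
            p.grad.div_(loss_scale)
    optimizer.step()
    return loss_scale
```

If every iteration is reported as skipped, the weights never change, so evaluation looks like an untrained model, which would be consistent with the low EM/F1 reported above.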