DeepSpeedExamples
error running DeepSpeedExamples/BingBertSquad/run_squad_deepspeed.sh
Error encountered when running DeepSpeedExamples/BingBertSquad/run_squad_deepspeed.sh:
$ ./run_squad_deepspeed.sh 16 ~/models/bert-base-uncased/pytorch_model.bin ~/datasets/squad_data ~/output
11/23/2021 15:42:24 - INFO - __main__ - Loading Pretrained Bert Encoder from: /home/bduser/models/bert-base-uncas/pytorch_model.bin
VOCAB SIZE: 30528
Traceback (most recent call last):
  File "nvidia_run_squad_deepspeed.py", line 1163, in <module>
11/23/2021 15:42:24 - INFO - __main__ - Loading Pretrained Bert Encoder from: /home/bduser/models/bert-base-uncas/pytorch_model.bin
    main()
  File "nvidia_run_squad_deepspeed.py", line 842, in main
    raise ValueError("Unable to find model state in checkpoint")
ValueError: Unable to find model state in checkpoint
Running into the same error (after addressing a host of other errors). Any resolution?
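One way to narrow this down is to check which top-level keys the checkpoint file actually contains. The sketch below assumes, without confirmation from this thread, that the script looks for the model weights under a wrapper key such as `module` or `model_state_dict`; both key names and the helper `find_model_state` are hypothetical and should be checked against the code around line 842 of nvidia_run_squad_deepspeed.py.

```python
def find_model_state(checkpoint, candidate_keys=("module", "model_state_dict")):
    """Return (key, state_dict) for the first candidate key found at the top
    level of a loaded checkpoint, or (None, None) if none is present.

    The candidate key names are assumptions for illustration; verify them
    against the loading code in the training script.
    """
    if not isinstance(checkpoint, dict):
        return None, None
    for key in candidate_keys:
        if key in checkpoint:
            return key, checkpoint[key]
    return None, None


# Usage sketch (requires PyTorch; the path is a placeholder):
#   import torch
#   checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
#   print(list(checkpoint.keys())[:10])   # inspect the top-level keys
#   key, state = find_model_state(checkpoint)
#   print("wrapper key found:", key)
```

If no wrapper key is found and the top-level keys are parameter names, the file is probably a bare state dict rather than the wrapped checkpoint layout assumed here, which would be consistent with the ValueError above.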