DeepSpeedExamples
"ds_train_bert_nvidia_data_bsz64k_seq128.sh" program stalls at the end of the first epoch
When I run "ds_train_bert_nvidia_data_bsz64k_seq128.sh", it stalls at the end of the first epoch.