DeepSpeedExamples

Step1 training failed

Open • omoiji opened this issue 2 years ago • 1 comment

[three images attached]

My card is a V100, but I get this error when running the training script for step 1.

omoiji, Apr 17 '23 06:04

Can you share more of the stack trace this is coming from? The hipconfig makes me think this is AMD, but your nvidia-smi implies you're running on NVIDIA?

jeffra, Apr 18 '23 18:04