DeepSpeedExamples
Step1 training failed
My GPU is a V100, but I get this error when running the training script for step 1.
Can you share more of the stack trace this error is coming from? The hipconfig call makes me think this is an AMD (ROCm) setup, but your nvidia-smi output implies you're running on NVIDIA.
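To narrow down the AMD vs. NVIDIA question, it may help to check which backend the installed PyTorch build targets. Below is a minimal sketch, assuming PyTorch is installed in the same environment the training script uses. `classify_backend` and `detect_backend` are hypothetical helper names, not part of DeepSpeed or PyTorch. The only library fields read are `torch.version.cuda` and `torch.version.hip`.

```python
def classify_backend(cuda_version, hip_version):
    """Hypothetical helper: name the GPU backend from torch.version fields.

    A ROCm build of PyTorch sets torch.version.hip; a CUDA build sets
    torch.version.cuda. If neither is set, the build is CPU-only.
    """
    if hip_version:
        return "rocm"
    if cuda_version:
        return "cuda"
    return "cpu-only"


def detect_backend():
    """Report the backend of the PyTorch build in the current environment."""
    try:
        import torch
    except ImportError:
        return "torch-not-installed"
    # getattr guards against builds where the hip attribute is absent.
    return classify_backend(torch.version.cuda, getattr(torch.version, "hip", None))


if __name__ == "__main__":
    print("PyTorch backend:", detect_backend())
```

If this reports `rocm` on a machine with a V100, the environment probably has a ROCm build of PyTorch installed, which could explain a hipconfig lookup on NVIDIA hardware. This is only a guess until the full stack trace is available.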