DeepSpeed
Issue in multi-node training with Slurm
I am trying to train models on multiple nodes with DeepSpeed. Are there any resources for that?
It seems PR #2404 was merged into main, but I can't find any documentation on how to use it. Kindly help. cc: @tjruwase @RezaYazdaniAminabadi @HeyangQin