awsome-distributed-training
Extra containerized NCCL tests
Issue #, if available: N/A
Description of changes: Sample .sbatch scripts to run NCCL tests in containers. There are two variants: one using the native implementation, and one that is purely PyTorch-based (which some of our customers have been using for benchmarking).
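For readers unfamiliar with the PyTorch-based variant, here is a minimal sketch of what such an all-reduce benchmark can look like. It is not the script added in this PR. The function names, the default message size, and the reliance on a `LOCAL_RANK` environment variable (as set by a launcher such as `torchrun`) are all assumptions made for illustration. The bus bandwidth formula follows the convention documented by nccl-tests for all-reduce: algorithm bandwidth scaled by 2(n-1)/n.

```python
import os
import time


def all_reduce_bus_bandwidth_gbps(size_bytes, seconds, world_size):
    """Bus bandwidth in GB/s for one all-reduce (hypothetical helper).

    Algorithm bandwidth is size / time; for all-reduce, the nccl-tests
    convention scales it by 2 * (n - 1) / n to get bus bandwidth.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    algbw = size_bytes / seconds / 1e9
    return algbw * 2 * (world_size - 1) / world_size


def benchmark_all_reduce(num_elements=64 * 1024 * 1024, warmup=5, iters=20):
    """Time dist.all_reduce on one float32 tensor (illustrative sketch)."""
    # Imported here so the helper above stays usable without PyTorch.
    import torch
    import torch.distributed as dist

    # Assumption: the launcher exports LOCAL_RANK, RANK, WORLD_SIZE,
    # MASTER_ADDR and MASTER_PORT, as torchrun does.
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
    dist.init_process_group(backend="nccl")

    # Zeros keep the values finite across repeated in-place reductions.
    tensor = torch.zeros(num_elements, dtype=torch.float32, device="cuda")

    # Warm-up iterations let NCCL set up its communicators.
    for _ in range(warmup):
        dist.all_reduce(tensor)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(tensor)
    torch.cuda.synchronize()
    seconds = (time.perf_counter() - start) / iters

    size_bytes = tensor.numel() * tensor.element_size()
    busbw = all_reduce_bus_bandwidth_gbps(size_bytes, seconds, dist.get_world_size())
    if dist.get_rank() == 0:
        print(f"size={size_bytes} B  time={seconds * 1e3:.3f} ms  busbw={busbw:.2f} GB/s")

    dist.destroy_process_group()


# Only run the benchmark when started by a distributed launcher.
if __name__ == "__main__" and "LOCAL_RANK" in os.environ:
    benchmark_all_reduce()
```

Inside a container, a script like this would typically be started once per GPU by the launcher, with rank 0 printing the result. How the launcher maps onto the Slurm allocation depends on the cluster setup and is not covered here.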
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.