awsome-distributed-training
Slurm job template: how a job can probe instance topology and hostname-instanceid mappings…
Issue #, if available: N/A
Description of changes: a sample template for writing a Slurm job that probes EC2 information, so that the job logs contain as much information as possible for later analysis.
- Check the instance topology.
- Display the mapping between the hostnames of the allocated nodes and their instance IDs.
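
For readers who want a feel for the two probes listed above, here is a minimal sketch. It is not necessarily what the template in this PR does. It assumes that IMDSv2 is reachable from the compute nodes, that the AWS CLI is installed with a default region configured, and that the instance role allows `ec2:DescribeInstanceTopology`. The job name, the node count, and the helper `instance_ids_from_mapping` are hypothetical.

```shell
#!/bin/bash
#SBATCH --job-name=probe-ec2   # hypothetical job name
#SBATCH --nodes=2              # example value; adjust to your job
#SBATCH --ntasks-per-node=1

# Hypothetical helper: turn "hostname instance-id" lines on stdin
# into a single space-separated list of instance IDs.
instance_ids_from_mapping() {
  awk '{print $2}' | paste -sd' ' -
}

# Only probe when actually running inside a Slurm allocation.
if [[ -n "${SLURM_JOB_ID:-}" ]]; then
  # One task per node: each task prints its hostname and its instance ID,
  # fetched from the instance metadata service with an IMDSv2 token.
  mapping=$(srun --ntasks-per-node=1 bash -c '
    token=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
      -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
    id=$(curl -s -H "X-aws-ec2-metadata-token: ${token}" \
      http://169.254.169.254/latest/meta-data/instance-id)
    echo "$(hostname) ${id}"
  ' | sort)

  echo "=== hostname to instance ID ==="
  echo "${mapping}"

  echo "=== instance topology ==="
  # Assumption: AWS CLI available and permitted to describe topology.
  # Word splitting of ${ids} is intentional (one argument per ID).
  ids=$(instance_ids_from_mapping <<<"${mapping}")
  aws ec2 describe-instance-topology --instance-ids ${ids} --output text
fi
```

Submitted with `sbatch`, the probe output lands in the job log ahead of the actual workload, so the placement of a slow or failed run can be checked afterwards without querying the cluster again.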
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.