
Support Slurm Heterogeneous Job

Open sunshine-syz opened this issue 2 years ago • 2 comments

Does submitit support Slurm heterogeneous jobs? If so, how can we submit one? If not, could the code be extended to support them?

sunshine-syz, Sep 15 '23 05:09

It's not supported at the moment, and from an API perspective I'm not sure how to handle this. Currently the API assumes there is one configuration per job, whereas here you want several configurations in the same job. Not impossible, but also non-trivial. What's the use case? Can you approximate this by starting two jobs?

gwenzek, Sep 18 '23 14:09

For example, you may want to start a distributed job whose components run on GPUs or CPUs with different specs. The components need to communicate with each other, so they cannot be started separately.

Here is one example: https://research-computing.git-pages.rit.edu/docs/slurm_tutorial_2.html

The Slurm documentation on submitting heterogeneous jobs: https://slurm.schedmd.com/heterogeneous_jobs.html#submitting
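To make the request concrete, here is a rough sketch of what such a submission looks like at the `sbatch` level, where each component's options are separated by a `:` argument. This is not submitit code: `build_hetjob_command` is a hypothetical helper, and the partition names (`cpu`, `gpu`), resource values, and `train.sh` are made-up placeholders.

```python
import shlex


def build_hetjob_command(script, components):
    """Build an sbatch argv for a heterogeneous job.

    `components` is a list of dicts mapping long sbatch option names
    (without the leading "--") to values, one dict per job component.
    Hypothetical helper for illustration; not part of submitit.
    """
    if not components:
        raise ValueError("at least one component is required")
    argv = ["sbatch"]
    for index, options in enumerate(components):
        if index > 0:
            # Slurm separates the components of a heterogeneous job with ":".
            argv.append(":")
        for name, value in options.items():
            argv.append(f"--{name}={value}")
    # The batch script follows the options of the last component.
    argv.append(script)
    return argv


# Placeholder resources: one CPU-only component and one GPU component.
components = [
    {"partition": "cpu", "ntasks": 1, "cpus-per-task": 8},
    {"partition": "gpu", "ntasks": 2, "gres": "gpu:1"},
]
command = build_hetjob_command("train.sh", components)
print(shlex.join(command))
```

The point is that a single submission carries one set of options per component, which is what the current one-configuration-per-job API cannot express.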

sunshine-syz, Sep 18 '23 18:09