
Node packing

Open KevinSayers opened this issue 3 years ago • 1 comments

Based on a discussion with @diskwarrior: document, or hold a workshop on, strategies for packing multiple jobs onto a single node.

KevinSayers avatar Apr 30 '21 20:04 KevinSayers

I looked into this, but I was not able to get multiple jobs to run on a single node. As I understand the SLURM documentation, if the partition's OverSubscribe option is set to EXCLUSIVE, that setting takes precedence over a job's own settings (i.e., you can't just submit with options such as -s (--oversubscribe) or --overcommit). To allow this easily, we would need to change the partition's OverSubscribe setting.

The SLURM FAQ has an entry that details some of the changes that would need to be made:

https://slurm.schedmd.com/faq.html#sharing

There may be other ways, but I suppose one alternative would be to package jobs into Python scripts and use MPI.
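As a rough illustration of that alternative, here is a minimal sketch that spreads a list of small, independent tasks across MPI ranks inside one exclusive-node job. It assumes mpi4py is available on the cluster; `run_task`, `tasks_for_rank`, and the task list are hypothetical placeholders, not anything we have in place today. If mpi4py cannot be imported, the script falls back to running everything serially on one rank.

```python
# Sketch: pack many small tasks into a single exclusive-node job.
# Assumption: mpi4py is installed; run_task and the task list are placeholders.

def tasks_for_rank(tasks, rank, size):
    """Round-robin split: rank r takes tasks r, r + size, r + 2 * size, ..."""
    return tasks[rank::size]


def run_task(task):
    """Hypothetical unit of work; replace with the real per-job computation."""
    return task * task


def main():
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    except ImportError:
        # Serial fallback so the sketch still runs without MPI.
        comm, rank, size = None, 0, 1

    tasks = list(range(100))  # stand-in for the real list of small jobs
    results = [run_task(t) for t in tasks_for_rank(tasks, rank, size)]

    # Collect every rank's results on rank 0.
    gathered = comm.gather(results, root=0) if comm is not None else [results]
    if rank == 0:
        flat = [r for chunk in gathered for r in chunk]
        print(f"completed {len(flat)} tasks on {size} rank(s)")


if __name__ == "__main__":
    main()
```

Inside a batch script that requests a whole node, this would be started with one rank per core using the site's MPI launcher (for example `srun` or `mpirun`, depending on how MPI is set up here). The round-robin split is the simplest choice; it balances well only when the tasks take roughly the same time.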

I'm not sure whether anyone has other thoughts; let me know if your experience has been different.

gainesdp avatar Jun 09 '23 20:06 gainesdp