flux-sched
Support optimizations of Rabbit allocations based on HPE co-design heuristics
Rabbit Load Balancing
- Ensure that allocations are spread as broadly across the rabbits as possible (no rabbit "hotspots")
- Ensure that MDTs from different jobs are distributed across as many rabbits as possible (MDTs are very CPU-intensive)
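The two load-balancing goals above can be sketched as a simple "least loaded first" choice. This is only an illustration, not the flux-sched implementation: the `Rabbit` class, its fields, and `pick_rabbit` are hypothetical names, and capacity is modeled as a single GiB counter per rabbit.

```python
from dataclasses import dataclass


@dataclass
class Rabbit:
    # Hypothetical model of one rabbit; not a flux-sched type.
    name: str
    capacity_gib: int
    used_gib: int = 0
    mdt_count: int = 0

    @property
    def free_gib(self):
        return self.capacity_gib - self.used_gib


def pick_rabbit(rabbits, size_gib, needs_mdt=False):
    """Pick the least-loaded rabbit that can fit the request, or None."""
    candidates = [r for r in rabbits if r.free_gib >= size_gib]
    if not candidates:
        return None
    if needs_mdt:
        # MDTs are CPU intensive: prefer the rabbit hosting the fewest MDTs,
        # and break ties by the fraction of storage already in use.
        key = lambda r: (r.mdt_count, r.used_gib / r.capacity_gib)
    else:
        # Plain storage: prefer the rabbit with the lowest used fraction.
        key = lambda r: (r.used_gib / r.capacity_gib, r.mdt_count)
    chosen = min(candidates, key=key)
    chosen.used_gib += size_gib
    if needs_mdt:
        chosen.mdt_count += 1
    return chosen
```

With three empty rabbits, three successive MDT requests land on three different rabbits, which is the "no hotspots" behavior described above.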
Dragonfly Topology
- Start by allowing scheduling to be restricted to a single pod, as opposed to free-form scheduling across the entire cluster
- Move towards minimizing the distance between compute node pods and rabbit pods
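As a rough sketch of "minimizing the distance", rabbits can be ranked by their total distance to the job's compute nodes. The distance metric here is an assumption made for illustration (0 within a pod, 1 between any two distinct pods), and `pod_distance` and `rank_rabbits_by_distance` are hypothetical names rather than anything in flux-sched.

```python
def pod_distance(pod_a, pod_b):
    """Assumed metric: 0 within a pod, 1 between any two distinct pods."""
    return 0 if pod_a == pod_b else 1


def rank_rabbits_by_distance(compute_pods, rabbit_pods):
    """Return rabbit names ordered from closest to farthest.

    compute_pods: list with the pod id of each compute node in the job.
    rabbit_pods:  dict mapping rabbit name -> pod id.
    """
    def total_distance(rabbit):
        return sum(pod_distance(p, rabbit_pods[rabbit]) for p in compute_pods)

    # Sort by total distance; the name is a deterministic tie-breaker.
    return sorted(rabbit_pods, key=lambda r: (total_distance(r), r))
```

Restricting a job to a single pod is the special case where only rabbits with a total distance of 0 are accepted.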
Rabbit Spread Policy
- Allow a single job to force multiple rabbits to be allocated to it (for bandwidth) without requiring that all of the storage on those rabbits be allocated
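One way to picture the spread policy is as splitting a capacity request evenly over a requested number of rabbits, so that each rabbit contributes only a slice of its storage. The function below is a minimal sketch under that assumption; `spread_allocation` and its parameters are hypothetical, and it deliberately gives up rather than trying an uneven split.

```python
def spread_allocation(free_gib, total_gib, min_rabbits):
    """Split total_gib across min_rabbits rabbits.

    free_gib: dict mapping rabbit name -> free capacity in GiB.
    Returns a dict rabbit name -> GiB to allocate, or None if the request
    cannot be satisfied with an even split.
    """
    # Prefer the rabbits with the most free space; name breaks ties.
    chosen = sorted(free_gib, key=lambda r: (-free_gib[r], r))[:min_rabbits]
    if len(chosen) < min_rabbits:
        return None
    share, remainder = divmod(total_gib, min_rabbits)
    plan = {}
    for i, rabbit in enumerate(chosen):
        # The first `remainder` rabbits absorb one extra GiB each.
        want = share + (1 if i < remainder else 0)
        if want > free_gib[rabbit]:
            return None
        plan[rabbit] = want
    return plan
```

For example, a 300 GiB request with `min_rabbits=3` takes 100 GiB from each of three rabbits instead of 300 GiB from one, leaving the rest of each rabbit available to other jobs.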
Dragonfly topology optimization needs to be flexible per job. If the possible values are "within a pod", "same rack as nodes", and "no constraints", we will want to be able to choose among them for each job. (This could be an actual part of the jobspec resource section.)
The same goes for packing rabbits into the minimal number of pods and for minimizing the distance between nodes and rabbits: some users will want that and others won't.
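To make the per-job tuning concrete, here is a small sketch that filters candidate rabbits according to a per-job locality setting. The three values come from the text above; everything else (the `Locality` enum, `eligible_rabbits`, and the `(pod, rack)` location tuples) is hypothetical and does not reflect an existing jobspec field.

```python
from enum import Enum


class Locality(Enum):
    # The three per-job values discussed above.
    WITHIN_POD = "within a pod"
    SAME_RACK = "same rack as nodes"
    NONE = "no constraints"


def eligible_rabbits(policy, node_locations, rabbit_locations):
    """Return the rabbit names a job may use under the given policy.

    node_locations:   list of (pod, rack) tuples, one per compute node.
    rabbit_locations: dict mapping rabbit name -> (pod, rack).
    """
    node_pods = {pod for pod, _ in node_locations}
    node_racks = set(node_locations)
    result = []
    for name, (pod, rack) in sorted(rabbit_locations.items()):
        if policy is Locality.WITHIN_POD and pod not in node_pods:
            continue
        if policy is Locality.SAME_RACK and (pod, rack) not in node_racks:
            continue
        result.append(name)
    return result
```

Keeping the policy as an explicit per-job value means users who do not care about locality can pass `Locality.NONE` and get the full candidate set.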
Prototype Rabbit Load Balancing PR: https://github.com/flux-framework/flux-sched/pull/812