gpushare-scheduler-extender
Other GPU sharing strategies besides bin packing
Are other GPU sharing strategies supported besides bin packing? If not, would you consider adding them as a feature?
What I want to achieve is that my containers are spread evenly among the existing GPUs in the cluster, instead of filling one GPU before switching to the next. Basically, a new container should be placed on the GPU with the least allocated memory rather than the most.
The motivation is that I want to run one training job per GPU. One workaround is to (artificially) claim the whole GPU memory: then no other container will ever be placed on that GPU. But that would block the entire GPU for other users of the cluster, which I'd like to avoid.
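For context, the workaround I mean looks roughly like the sketch below: a pod that requests the full per-GPU memory through the extender's `aliyun.com/gpu-mem` resource, so no other shared pod fits on the same card. The pod name, image, and the value `16` (a 16 GiB GPU, with the device plugin counting in GiB) are only assumptions for illustration; adjust them to your hardware and plugin configuration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: exclusive-training   # hypothetical example pod
spec:
  restartPolicy: Never
  containers:
    - name: trainer
      image: my-training-image:latest   # placeholder image
      resources:
        limits:
          # Claim (nearly) all memory of one GPU so the bin-packing
          # scheduler cannot place any other shared workload on it.
          aliyun.com/gpu-mem: 16
```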
A spread strategy is available in this branch: https://github.com/icefed/gpushare-scheduler-extender/tree/feat-strategy-spread

Set the environment variable `NODE_GPU_DEVS_SCHEDULE_STRATEGY=spread` on the scheduler extender to use it.
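If it helps, the variable goes on the extender's container spec, something like the fragment below. The deployment and namespace names here assume the stock `gpushare-schd-extender` manifest in `kube-system`; adjust them to match your install.

```yaml
# Fragment of the scheduler-extender Deployment showing only the relevant env entry.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpushare-schd-extender
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: gpushare-schd-extender
          env:
            # Switch device selection from the default bin-packing
            # behaviour to spreading pods across GPUs.
            - name: NODE_GPU_DEVS_SCHEDULE_STRATEGY
              value: "spread"
```

Alternatively, on a running install you could patch it in place, e.g. `kubectl -n kube-system set env deployment/gpushare-schd-extender NODE_GPU_DEVS_SCHEDULE_STRATEGY=spread` (again assuming those names).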