gpushare-scheduler-extender

Other GPU sharing strategies besides bin packing

Open: aakhundov opened this issue 5 years ago • 1 comment

Are any GPU sharing strategies other than bin packing supported? If not, would you consider adding them as a feature?

What I want to achieve is that my containers are spread evenly among the existing GPUs in the cluster, instead of filling one GPU before moving on to the next. In other words, a new container should be placed on the GPU with the least allocated memory rather than the one with the most.
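To make the difference concrete, here is a minimal sketch of the two placement rules. It is not the extender's actual code: the names `pick_gpu` and `place_all` are hypothetical, and it assumes every GPU has the same memory capacity and that requests and capacity use the same unit.

```python
from typing import Dict, List, Optional


def pick_gpu(allocated: Dict[int, int], capacity: int, request: int,
             strategy: str = "spread") -> Optional[int]:
    """Return the GPU index for a new container, or None if nothing fits.

    allocated maps GPU index -> memory already allocated on that GPU.
    All names here are illustrative, not taken from the extender.
    """
    # Keep only GPUs that still have room for the request.
    candidates = [g for g, used in allocated.items()
                  if capacity - used >= request]
    if not candidates:
        return None
    if strategy == "spread":
        # Spread: least allocated memory first; ties go to the lowest index.
        return min(candidates, key=lambda g: (allocated[g], g))
    # Bin packing: most allocated memory first; ties go to the lowest index.
    return min(candidates, key=lambda g: (-allocated[g], g))


def place_all(requests: List[int], num_gpus: int, capacity: int,
              strategy: str) -> List[Optional[int]]:
    """Place requests one by one and return the chosen GPU for each."""
    allocated = {g: 0 for g in range(num_gpus)}
    placement = []
    for request in requests:
        gpu = pick_gpu(allocated, capacity, request, strategy)
        placement.append(gpu)
        if gpu is not None:
            allocated[gpu] += request
    return placement
```

With four GPUs of 16 units each and four requests of 4 units, `place_all([4, 4, 4, 4], 4, 16, "spread")` returns `[0, 1, 2, 3]`, while any other strategy value falls back to bin packing and returns `[0, 0, 0, 0]`.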

The motivation is that I want to run one training job per GPU. One workaround is to (artificially) claim the whole GPU memory, so that no other container is ever placed on that GPU. But that would block the entire GPU for other users of the cluster, which I want to avoid.

aakhundov, Oct 09 '19 15:10

This branch adds a spread strategy: https://github.com/icefed/gpushare-scheduler-extender/tree/feat-strategy-spread. Set the environment variable NODE_GPU_DEVS_SCHEDULE_STRATEGY=spread to use it.

icefed, Nov 13 '19 14:11