
GPU sharing support across models

Open abhinav-cashify opened this issue 3 years ago • 3 comments

🚀 The feature

We need a feature for sharing a GPU across models. It could be configured by setting 0 < workers < 1 for a model.
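For illustration, the proposed fractional-worker setting might look something like this in config.properties (hypothetical syntax only — no such option exists in TorchServe today; the 0.25 share is made up):

```properties
# Hypothetical: a value below 1 would mean this model claims only a
# fraction of a GPU, leaving the rest for other models.
# (Not a real TorchServe option -- illustrates the feature request.)
workers=0.25
```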

Motivation, pitch

Currently, one of my models uses only 15% of the GPU. I want multiple models to share the GPU simultaneously, so the remaining capacity can be used by other models at the same time.

Alternatives

No response

Additional context

No response

abhinav-cashify avatar Jul 26 '22 18:07 abhinav-cashify

@msaroufim

amit-cashify avatar Jul 27 '22 08:07 amit-cashify

@abhinav-cashify @amit-cashify this is already on our roadmap.

lxning avatar Jul 27 '22 16:07 lxning

@abhinav-cashify @amit-cashify out of curiosity, what kind of GPU are you using? Starting with Ampere, NVIDIA supports MIG, which provides resource isolation and lets you allocate partial GPUs. We would just need to make sure the right device id is passed from config.properties to the handler: https://docs.nvidia.com/datacenter/tesla/mig-user-guide/

Here's how to use it in your PyTorch code https://discuss.pytorch.org/t/access-gpu-partitions-in-mig/142272
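As a rough sketch of the selection mechanism (the MIG UUID below is a placeholder — real values come from `nvidia-smi -L` — and `pin_worker_to_mig` is a hypothetical helper, not part of TorchServe): CUDA only enumerates the MIG instance named in `CUDA_VISIBLE_DEVICES`, so each worker process can be pinned to its own partition before the framework initializes CUDA.

```python
import os

# Placeholder MIG instance UUID -- list real ones with `nvidia-smi -L`.
MIG_UUID = "MIG-GPU-00000000-0000-0000-0000-000000000000/1/0"

def pin_worker_to_mig(mig_uuid: str) -> str:
    """Restrict this process to a single MIG partition.

    Must run before CUDA is initialized (i.e. before any torch.cuda call);
    afterwards the partition is the only device the process sees, so the
    framework addresses it as "cuda:0".
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = mig_uuid
    return "cuda:0"  # the sole visible device from this process's view

device = pin_worker_to_mig(MIG_UUID)
print(device)
```

The key design point is that the restriction happens via the environment before CUDA starts, so the model code itself needs no MIG awareness.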

If you get it to work, I'd be happy to merge your contribution; otherwise, we can look into this in our next sprint.

msaroufim avatar Aug 15 '22 23:08 msaroufim