Is there any way in the meantime to request more than 1 replica from each GPU in my node?

Open wei1793786487 opened this issue 1 year ago • 6 comments

I have enabled MPS with a division factor (replicas) of 10. In our application scenario, however, we sometimes need to allocate 2 whole GPUs directly, which would be equivalent to specifying nvidia.com/gpu: 20. If I set nvidia.com/gpu > 1, I get the error: 'request for "nvidia.com/gpu": invalid request: maximum request size for shared resources is 1; found 10, which is unexpected'.

Is there any way in the meantime to request more than 1 replica from each GPU in my node?

wei1793786487 avatar Aug 27 '24 09:08 wei1793786487

My configuration file is:

    version: v1
    sharing:
      mps:
        resources:
        - name: nvidia.com/gpu
          replicas: 10

wei1793786487 avatar Aug 27 '24 09:08 wei1793786487

If failRequestsGreaterThanOne=true were set in either of these configurations (MPS or TimeSlicing) and a user requested more than one nvidia.com/gpu or nvidia.com/gpu.shared resource in their pod spec, then the container would fail with the resulting error you've seen.
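
For reference, a minimal sketch of where this flag sits in a time-slicing configuration, assuming the same config layout as the MPS file posted above (the replica count is only an example value):

```yaml
version: v1
sharing:
  timeSlicing:
    # When true, a request for more than one shared replica is rejected
    # instead of being granted without any extra share of the GPU.
    failRequestsGreaterThanOne: true
    resources:
    - name: nvidia.com/gpu
      replicas: 10   # example value, matching the MPS config in this issue
```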

agrogov avatar Sep 25 '24 06:09 agrogov

I would also like to know the answer to this question. You can request multiple GPU resources when MPS is not in use, but you cannot once MPS is enabled. @agrogov

ZYWNB666 avatar Nov 14 '24 07:11 ZYWNB666

@ZYWNB666 as far as I understand, failRequestsGreaterThanOne is always set to true when MPS is used, so there is no way to change this behaviour.
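
To illustrate what that limit means in practice, here is a sketch of a pod spec that stays within it. The pod name, container name, and image are hypothetical placeholders. With the MPS configuration from this issue (replicas: 10), a request of 1 corresponds to one replica of a single GPU, and any larger value is rejected with the error quoted in the original post.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mps-example            # hypothetical name
spec:
  containers:
  - name: cuda-workload        # hypothetical name
    image: example.com/my-cuda-app:latest   # placeholder image, not a real one
    resources:
      limits:
        nvidia.com/gpu: 1      # largest value accepted for a shared (MPS) resource
```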

agrogov avatar Nov 14 '24 22:11 agrogov

Does anyone have any ideas?

arthas3014 avatar Mar 18 '25 08:03 arthas3014

I saw there was an attempt to add support for requesting more than 1 with MPS, but the PR was then closed: https://github.com/NVIDIA/k8s-device-plugin/pull/586

gfrankliu avatar Jun 20 '25 07:06 gfrankliu