k8s-device-plugin Is there any way in the meantime to request more than 1 replica from each GPU in my node?

I have enabled MPS with 10 replicas per GPU. In our use case we sometimes need to allocate 2 whole GPUs to a single pod, which would be equivalent to requesting nvidia.com/gpu: 20. However, if I set nvidia.com/gpu to anything greater than 1, I get this error: 'request for "nvidia.com/gpu": invalid request: maximum request size for shared resources is 1; found 10, which is unexpected'.

Is there any way in the meantime to request more than 1 replica from each GPU in my node?

Aug 27 '24 09:08 wei1793786487

My configuration file is:

    version: v1
    sharing:
      mps:
        resources:
        - name: nvidia.com/gpu
          replicas: 10
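
For context, here is a minimal sketch of the kind of pod spec that triggers the error under this configuration. The pod name, container name, and image are placeholders I made up for illustration; only the nvidia.com/gpu limit matters.

```yaml
# Hypothetical pod spec, for illustration only.
apiVersion: v1
kind: Pod
metadata:
  name: mps-test            # placeholder name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-app          # placeholder name
    image: your-cuda-image  # placeholder, substitute your own image
    resources:
      limits:
        # Any value greater than 1 is rejected with
        # "maximum request size for shared resources is 1".
        nvidia.com/gpu: 10
```

With the limit set to 1 the pod is accepted, but it then gets only one of the 10 MPS replicas of a single GPU.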

Aug 27 '24 09:08 wei1793786487

If failRequestsGreaterThanOne=true is set in either sharing configuration (MPS or time-slicing) and a pod spec requests more than one nvidia.com/gpu or nvidia.com/gpu.shared resource, the container fails with the error you are seeing.
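
To show where that flag lives, here is a sketch of a time-slicing configuration with it written out explicitly. The replicas value is just carried over from the configuration posted above. I have not verified that putting the same key under mps has any effect, so treat that part as an open assumption.

```yaml
# Sketch of a time-slicing sharing config, for illustration only.
version: v1
sharing:
  timeSlicing:
    # When true, a request for more than one shared replica is rejected.
    failRequestsGreaterThanOne: true
    resources:
    - name: nvidia.com/gpu
      replicas: 10   # value reused from the MPS config in this thread
```

As far as I know, even with the flag set to false under time-slicing, requesting several replicas does not guarantee a proportionally larger share of the GPU.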

Sep 25 '24 06:09 agrogov

I would also like to know the answer to this question. Without MPS you can request multiple GPU resources, but once MPS is enabled you no longer can. @agrogov

Nov 14 '24 07:11 ZYWNB666

@ZYWNB666 as far as I understand, failRequestsGreaterThanOne is always set to true when MPS is used, so there is no way to change this behaviour...

Nov 14 '24 22:11 agrogov

Does anyone have an idea?

Mar 18 '25 08:03 arthas3014

I saw there was an attempt to add support for requesting more than 1 MPS replica, but the PR was then closed: https://github.com/NVIDIA/k8s-device-plugin/pull/586

Jun 20 '25 07:06 gfrankliu