Cannot use entire GPU memory
Hi,
I have an A100-PCIE-40GB GPU and I am trying to use nos MPS dynamic partitioning. The issue is that it seems to miscalculate the total capacity. For example, when I try to run 2 pods that each request `nvidia.com/gpu-20gb: 1`, one of them always stays in Pending, yet I am able to schedule 1 pod requesting `nvidia.com/gpu-20gb` together with 2 more pods requesting `nvidia.com/gpu-10gb`. I have run into this problem of not being able to use the full GPU memory with a few other combinations as well.
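For reference, this is roughly the kind of pod spec I am applying for each of the two pods (names and image are just illustrative, the relevant part is the resource limit):

```yaml
# Minimal sketch of one of the two pods; both request a 20GB MPS slice
# of the A100-PCIE-40GB, so I would expect them to fit together.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-20gb-pod-1
spec:
  containers:
    - name: app
      image: nvidia/cuda:11.8.0-base-ubuntu22.04
      command: ["sleep", "infinity"]
      resources:
        limits:
          nvidia.com/gpu-20gb: 1
```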
Does anyone have any idea what might be going on? Any help would be much appreciated.