Can't pull predefined version of NVIDIA GPU driver (nvidia_gpu role)

Open SalaryTheft opened this issue 1 year ago • 2 comments

https://github.com/ibm-mas/ansible-devops/blob/a4c2c3b04e4967bdc5da62b00bec2b0923aa7b33/ibm/mas_devops/roles/nvidia_gpu/defaults/main.yml#L10

Screenshot 2024-07-12 163442

I updated gpu-cluster-policy (kind: ClusterPolicy) to use the latest driver, and that resolved it.
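
A minimal sketch of that ClusterPolicy change, for anyone hitting the same thing. It assumes the policy is named gpu-cluster-policy (as above), that the driver version is set at spec.driver.version as in the NVIDIA GPU Operator's ClusterPolicy, and that `oc` is logged in to the cluster. `build_patch` and `APPLY` are hypothetical names used only for this example, and the default version is the one mentioned later in this thread, not necessarily the latest.

```shell
#!/bin/sh
# Driver version to pin; defaults to the version quoted later in this thread.
GPU_DRIVER_VERSION="${GPU_DRIVER_VERSION:-550.54.14}"

# Build a JSON merge patch that touches only spec.driver.version.
# (build_patch is a hypothetical helper, not part of any tool.)
build_patch() {
  printf '{"spec":{"driver":{"version":"%s"}}}' "$1"
}

PATCH="$(build_patch "$GPU_DRIVER_VERSION")"
echo "$PATCH"

# Only patch the cluster when explicitly asked (APPLY=1), so the
# script can be run without a cluster to inspect the patch first.
if [ "${APPLY:-0}" = "1" ]; then
  oc patch clusterpolicy gpu-cluster-policy --type merge -p "$PATCH"
fi
```

Running it as-is only prints the patch; running it with `APPLY=1` sends the patch to the cluster.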

SalaryTheft avatar Jul 12 '24 07:07 SalaryTheft

Just noticed that RHCOS 4.14 is based on RHEL 9.2: https://access.redhat.com/articles/6907891

Edit: As a workaround, set the GPU_DRIVER_VERSION environment variable to define the driver version manually.

export GPU_DRIVER_VERSION=550.54.14

SalaryTheft avatar Jul 12 '24 07:07 SalaryTheft

Looking into this now for you

JonahLuckett avatar Jul 23 '24 11:07 JonahLuckett

https://jsw.ibm.com/browse/MASCORE-4342

durera avatar Nov 01 '24 22:11 durera