Christopher Desiniotis
Hi everyone, this issue is not a bug with the NVIDIA driver container. The driver container requires that the kernel headers for the running kernel are present and can be accessed by the...
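As a rough illustration of what to check (the package names below assume a yum-based distribution such as CentOS/RHEL and are not taken from this thread), you can confirm on the host that headers matching the running kernel are actually available:

```sh
# Show the running kernel, then check whether matching header/devel packages can be installed.
uname -r
yum list available kernel-headers-$(uname -r) kernel-devel-$(uname -r)
```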
We do not support Rocky Linux. Please refer to our platform support page for all the operating systems we currently support: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/platform-support.html#linux-distributions
Could you provide logs of the dcgm pod? Also, can you try deploying the operator again with dcgm disabled: `--set dcgm.enabled=false`?
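For example (the namespace and release name below are placeholders, not values from this issue):

```sh
# Collect logs from the dcgm pod, then redeploy with dcgm disabled.
kubectl logs -n gpu-operator-resources nvidia-dcgm-kfqch
helm upgrade --install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator \
  --set dcgm.enabled=false
```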
The dcgm pod `nvidia-dcgm-kfqch` runs DCGM; the dcgm-exporter pod collects GPU metrics from it and exports them to Prometheus.
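As a quick sanity check that the metrics pipeline is working (the service name and port below are the dcgm-exporter defaults and may differ in your deployment):

```sh
# Port-forward the exporter and look for a well-known DCGM metric.
kubectl -n gpu-operator-resources port-forward svc/nvidia-dcgm-exporter 9400:9400 &
curl -s http://localhost:9400/metrics | grep DCGM_FI_DEV_GPU_UTIL
```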
Hi @kralicky -- this is the expected behavior. You need a license per VM. It looks like you have 4 VMs, and so 4 licenses should be leased.
Hi @yug0slav, can you try naming your repo configuration files `CentOS-Vault.repo` and `cuda.repo` and retrying? Naming them this way will replace the existing repo configuration files in the driver...
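A minimal sketch of one way to get files with those exact names into the cluster (the ConfigMap and namespace names are placeholders; wiring the ConfigMap to the driver container follows the GPU Operator's custom-repository documentation):

```sh
# Create a ConfigMap whose keys match the desired file names exactly.
kubectl create configmap repo-config \
  --namespace gpu-operator \
  --from-file=CentOS-Vault.repo \
  --from-file=cuda.repo
```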
Thanks for the additional details. If I am understanding you correctly, there are two issues; correct me if I am wrong.

1. On CentOS 7, you have to name your repo...
@chrisholzheimer

> apt seem to only raise a warning, but it seems nvidia-driver-daemonset pod still need packages from official repos to work

Yes, these packages are required for the driver...
> but it does not work as expected: default sources.list stays in use as well.

Can you provide driver logs for this case? It is expected behavior for the default...
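For instance (namespace and pod name are placeholders for whatever your deployment uses):

```sh
# Find the driver pod on the affected node, then collect its logs.
kubectl get pods -n gpu-operator-resources -o wide | grep nvidia-driver-daemonset
kubectl logs -n gpu-operator-resources <nvidia-driver-daemonset-pod>
```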
Can you confirm that `/etc/apt/sources.list.d/` gets created inside the driver container and that your custom repo file can be found in that directory? Edit: Can you also confirm that your repo...
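One way to check this from outside the container (the pod name and the `custom.list` file name are placeholders):

```sh
# List the mounted repo configuration inside the running driver container.
kubectl exec -n gpu-operator-resources <nvidia-driver-daemonset-pod> -- ls -l /etc/apt/sources.list.d/
kubectl exec -n gpu-operator-resources <nvidia-driver-daemonset-pod> -- cat /etc/apt/sources.list.d/custom.list
```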