Change GFD image repository (v0.15.0 Helm chart)
I don't see any option in the Helm chart to change the repository of the GFD image.
It can be useful in a company network where the cluster doesn't have any access to the internet.
We do have https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/helm/nvidia-device-plugin/values.yaml#L48 , isn't this what you are looking for?
Yeah, it changes the k8s-device-plugin repository but not the GFD repository; the deployment tries to pull the GFD pod image from the default repository.
They share the same image (https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/container/Dockerfile.ubuntu#L72), and we have https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/helm/nvidia-device-plugin/templates/daemonset-gfd.yml#L134
Thanks! I will try this tomorrow, but I think we can close this issue!
I'll close it once it works for you :), not before
Actually, I'm working with @YFrendo to deploy this plugin on a brand new air-gapped k8s GPU infrastructure, and I did override this setting to point to our mirrored image hub. That worked for the k8s-device-plugin image.
The issue is actually with Node Feature Discovery (NFD), which is deployed from a separate Helm chart located here: https://github.com/NVIDIA/k8s-device-plugin/tree/main/deployments/helm/nvidia-device-plugin/charts When I try to enable GFD from the device plugin Helm chart (here => https://github.com/NVIDIA/k8s-device-plugin/blob/925be6d97361359803eb6502d15fa3e69dbe6e2b/deployments/helm/nvidia-device-plugin/values.yaml#L106C3-L106C17), the created pods try to pull an image from registry.k8s.io/nfd/node-feature-discovery:v0.15.3 or something like that, and so far I haven't managed to override that image location, neither from the device plugin chart itself nor when I use the separate Helm chart and override the appropriate value (the values template is inside https://github.com/NVIDIA/k8s-device-plugin/blob/main/deployments/helm/nvidia-device-plugin/charts/node-feature-discovery-chart-0.15.3.tgz).
@YFrendo if you are able to install NFD separately, you could pass --set nfd.enabled=false when installing the device plugin and/or GFD. This should disable the internal NFD dependency.
This is the solution: to get it to work in a restricted environment, you first have to install NFD separately.
Everything works for us now!
But maybe this should be more explicit in the documentation (or an nfd.image option could be added to the chart). Also, nfd.enabled could be added to the Helm chart example values!
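For anyone else in an air-gapped setup, the approach described above might look roughly like the values file below. This is only a sketch under assumptions: the registry host registry.example.internal is a hypothetical placeholder for your own mirror, and it assumes NFD has already been installed separately from a mirrored image. The keys image.repository, gfd.enabled and nfd.enabled are the ones discussed in this thread.

```yaml
# values-airgapped.yaml (hypothetical file name)
# Sketch only: replace registry.example.internal with your own mirror.

image:
  # Mirrored copy of the k8s-device-plugin image.
  # GFD uses this same image, so one override covers both.
  repository: registry.example.internal/nvidia/k8s-device-plugin

gfd:
  # Deploy GPU Feature Discovery alongside the device plugin.
  enabled: true

nfd:
  # Skip the bundled NFD subchart; this assumes NFD was
  # installed separately beforehand.
  enabled: false
```

The file can then be passed to helm with the -f flag when installing or upgrading the device plugin chart.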
https://github.com/NVIDIA/k8s-device-plugin/blob/v0.15.0/deployments/helm/nvidia-device-plugin/values.yaml
Thanks for your support !
This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.
This issue was automatically closed due to inactivity.