Shiva Krishna Merla
> helm install --wait --generate-name
> ./gpu-operator \
> --set nfd.enabled=false \ (because I have deployed above)
> --set operator.defaultRuntime=crio
> --set driver.enabled=false (because I have install on the local...
@william0212 the cuda-validator pod doesn't download CUDA images; we have a `vectorAdd` sample within the `gpu-operator-validator` image, which gets invoked at runtime. Wondering if cuda 11.4.1 package installed directly on host is causing...
@dioguerra Please pull the operator charts from here: https://catalog.ngc.nvidia.com/orgs/nvidia/helm-charts/gpu-operator
@kelonsen did you look into the dcgm-exporter metrics collected for memory utilization? https://github.com/NVIDIA/gpu-monitoring-tools/blob/master/etc/dcgm-exporter/dcp-metrics-included.csv Also, metrics are mapped to pod-level resources to track the usage per pod (pod-name, namespace, device-id). This [blog](https://developer.nvidia.com/blog/monitoring-gpus-in-kubernetes-with-dcgm/)...
Thanks for the feature request. This will indeed be a great feature. Currently the only way this can be done is with pre-installed drivers on the host with GPU operator. We...
@khatrig, we currently package a single driver version into each image, hence the requirement to have separate daemonsets.
@jear Will update you on this; we don't have official driver images for SLES15 yet and, as @dualvtable mentioned, it's in the works.
@jear We are working with SUSE on releasing an official image for SLES (planned for GPU operator 1.11). Meanwhile, from the link you have shared above, it looks like we need changes to...
@jear With the upcoming release we are planning to support RKE2 with Ubuntu and RHEL8. The toolkit config required for Ubuntu would be as below. SLES support is still being reviewed...
@AStrangwood yes, this is a known limitation with GPU operator today and we are looking to support mixed mode soon.