gpu-operator
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
### 1. Quick Debug Checklist

- [ ] Are you running on an Ubuntu 18.04 node? ---> No. Amazon Linux 2
- [x] Are you running Kubernetes v1.13+? ---> v1.21...
_The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense._...
The issue still reproduces in gpu-operator v22.9.0. `kubectl --kubeconfig -n gpu logs cuda-vectoradd` ``` Failed to allocate device vector A (error code CUDA driver version is insufficient for CUDA...
We are using a **GKE** cluster (v1.23.8-gke.1900) with NVIDIA multi-instance **A100 GPU** nodes. We want to install the NVIDIA GPU Operator on this cluster. The default container runtime in our case is **containerd**, so we...
### 1. Quick Debug Checklist

* Are you running on an Ubuntu 18.04 node? No, I am running **Red Hat Enterprise Linux CoreOS 410.84.202209231843-0**
* Are you running Kubernetes v1.13+?...
Title says it all.
We were running fine on stable 1.6.2 with the driver version configured as "450.80.02", but after the OCP cluster was upgraded to minor version 4.6.60, we saw the nvidia-container-toolkit pods in GPU...