gpu-operator

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

Results 392 gpu-operator issues

[gpu-operator-issue1.txt](https://github.com/NVIDIA/gpu-operator/files/13481574/gpu-operator-issue1.txt)

### Issue or feature description
My K8s Cluster has 2 nodes with NVIDIA GPU:
1. node1 containerRuntime is docker
2. node2 containerRuntime is containerd
```bash
$ k get node -o...
```

I have installed confidential-container and gpu-operator following https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-operator-kata.html. When I create a pod using runtimeClass `kata-qemu-nvidia-gpu`:
```
Events:
  Type     Reason                  Age                From     Message
  Warning  FailedCreatePodSandBox  0s (x14 over 13s)  kubelet...
```

_The template below is mostly useful for bug reports and support questions. Feel free to remove anything which doesn't apply to you and add more information where it makes sense._...

### 1. Quick Debug Checklist
* Are you running on an Ubuntu 18.04 node? No, I am running **Red Hat Enterprise Linux 8.6 (Ootpa)**
* Are you running Kubernetes v1.13+?...

https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/23.6.0/getting-started.html

### 1. Quick Debug Information
* BAREMETAL
* OS/Version: Ubuntu 22.04.3 LTS
* Container Runtime Type/Version: containerd
* K8s Flavor/Version: Rancher RKE2 v1.25.12+rke2r1
* GPU Operator Version: nvidia gpu-operator-v23.6.0

###...

mig-config.yaml
```
mig-configs:
  custom-config:
    - devices: [0]
      mig-enabled: false
    - devices: [1]
      mig-enabled: true
      mig-devices:
        "7g.80gb": 1
    - devices: [2]
      mig-enabled: true
      mig-devices:
        "2g.20gb": 3
    - devices: [3]
      mig-enabled: true...
```

Hi everyone, I am facing an issue with gpu-operator when scaling my K8s cluster. When adding a GPU node to the cluster, gpu-operator will, among other things, install container runtime and...