0/1 nodes are available: 1 Insufficient nvidia.com/gpu.
Fri Feb 17 16:56:54 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02 Driver Version: 510.85.02 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A40-4Q On | 00000000:02:00.0 Off | 0 |
| N/A N/A P0 N/A / N/A | 0MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I am using k8s-device-plugin:1.9. kubectl describe node shows nvidia.com/gpu 0 0, and the error still occurs.
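For anyone reproducing this, the GPU count the scheduler sees can be read from the Node object's `status.capacity` and `status.allocatable` maps. The snippet below is only a minimal sketch: the helper name `gpu_counts` and the embedded sample are hypothetical, and the sample mirrors the `nvidia.com/gpu 0 0` reported above rather than real output from this cluster.

```python
import json

GPU_RESOURCE = "nvidia.com/gpu"

def gpu_counts(node):
    """Return (capacity, allocatable) for nvidia.com/gpu from a Node object dict.

    A missing key is treated as 0, which is what the scheduler effectively sees
    when the device plugin has not registered the resource.
    """
    status = node.get("status", {})
    capacity = int(status.get("capacity", {}).get(GPU_RESOURCE, "0"))
    allocatable = int(status.get("allocatable", {}).get(GPU_RESOURCE, "0"))
    return capacity, allocatable

# Hypothetical sample shaped like `kubectl get node <node-name> -o json` output.
SAMPLE_NODE_JSON = """
{
  "status": {
    "capacity": {"cpu": "8", "nvidia.com/gpu": "0"},
    "allocatable": {"cpu": "8", "nvidia.com/gpu": "0"}
  }
}
"""

capacity, allocatable = gpu_counts(json.loads(SAMPLE_NODE_JSON))
print(f"{GPU_RESOURCE}: capacity={capacity} allocatable={allocatable}")
```

If both values are 0 while nvidia-smi on the host lists a GPU, the device plugin has probably not registered with the kubelet, which would match the Insufficient nvidia.com/gpu scheduling message.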
The plugin reports the following error:
2023/02/17 09:19:13 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
2023/02/17 09:19:13 Could not register device plugin: rpc error: code = Unimplemented desc = unknown service deviceplugin.Registration
2023/02/17 09:19:13 Could not contact Kubelet, retrying. Did you enable the device plugin feature gate?
2023/02/17 09:19:13 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
2023/02/17 09:19:13 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
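The key line in that log is the registration failure. As a rough sketch, the gRPC status code and the service name the plugin asked for can be pulled out of it as shown below. The function name `parse_register_error` and the regular expression are illustrative assumptions and are not part of k8s-device-plugin.

```python
import re

# Matches the gRPC error text as it appears in the plugin log above.
REGISTER_ERROR = re.compile(
    r"rpc error: code = (?P<code>\w+) desc = unknown service (?P<service>[\w.]+)"
)

def parse_register_error(log_text):
    """Return the gRPC status code and requested service name, or None if absent."""
    match = REGISTER_ERROR.search(log_text)
    if match is None:
        return None
    return {"code": match.group("code"), "service": match.group("service")}

LOG_LINE = (
    "2023/02/17 09:19:13 Could not register device plugin: rpc error: "
    "code = Unimplemented desc = unknown service deviceplugin.Registration"
)

print(parse_register_error(LOG_LINE))
```

In gRPC, `Unimplemented` with `unknown service` means the server on the other end of the socket does not serve the requested service. One plausible reading, not confirmed anywhere in this thread, is that the kubelet does not expose the registration API version that this plugin release speaks, so comparing the plugin version against the kubelet version is a reasonable first check.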
This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.