
DCGM unable to start: DCGM initialization error, Error: Failed to initialize NVML

Open coder-2014 opened this issue 5 months ago • 2 comments

Description
The error is intermittent. After a restart, the server runs successfully.

Triton Information
nvcr.io/nvidia/tritonserver:24.05-py3

Running on Kubernetes; the container has only CPUs (no GPU).


coder-2014 • Sep 29 '24 09:09