elastic-gpu-agent
elastic-gpu-agent is a Kubernetes device plugin for allocating GPU resources on a node.
### Issue
Deploying a pod fails with an error when the GPU limit is set to 2 cards; with a limit of 1 card, the pod runs successfully.
### Resource setting
elasticgpu.io/gpu-core: "200"
### Error message
Error: failed to...
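The issue above equates a limit of 2 cards with `elasticgpu.io/gpu-core: "200"`, which suggests 100 gpu-core units per physical card. The following is a minimal Go sketch under that assumption; the 100-unit convention is inferred from the issue text rather than confirmed by it, and `splitGPUCore` is a hypothetical helper, not part of elastic-gpu-agent. It shows how such a value would split into whole cards and a partial-card remainder:

```go
package main

import (
	"fmt"
	"strconv"
)

// unitsPerCard is an ASSUMPTION inferred from the issue, where a limit of
// 2 cards is written as gpu-core "200". It is not taken from the agent's code.
const unitsPerCard = 100

// splitGPUCore is a hypothetical helper (not an elastic-gpu-agent API).
// It parses a gpu-core quantity string and returns the number of whole
// cards and the leftover units that would fall on a partial card.
func splitGPUCore(value string) (cards, remainder int, err error) {
	units, err := strconv.Atoi(value)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid gpu-core value %q: %w", value, err)
	}
	if units < 0 {
		return 0, 0, fmt.Errorf("gpu-core value must be non-negative, got %d", units)
	}
	return units / unitsPerCard, units % unitsPerCard, nil
}

func main() {
	for _, v := range []string{"100", "200", "50"} {
		cards, rem, err := splitGPUCore(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("gpu-core=%s: %d whole card(s), %d unit(s) on a partial card\n", v, cards, rem)
	}
}
```

Under this reading, the failing case in the issue is the one that requests more than a single whole card, which may help narrow down where the allocation path breaks.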
Hello, is egpu-nvidia-container-toolkit open source? What is the difference between egpu-nvidia-container-toolkit and nvidia-container-toolkit?
1. Migrate the code to the new framework.
2. Fix the golint errors.
3. Support GPU CRD creation when the agent starts.
kubectl -n kube-system logs elastic-gpu-agent-vjx7h
Defaulted container "elastic-gpu-agent" out of: elastic-gpu-agent, elastic-gpu-installer (init)
panic: error opening libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
goroutine 1...