gpu-manager
Does not seem to work on the newest A100 80G
This is a really great solution for dividing up GPUs.
It works perfectly on my A100 40G cards, but when I follow exactly the same steps on an A100 80G, the DaemonSet pod keeps restarting with the following message:
Back-off restarting failed container
There is almost no log output. The pod's log contains only one line:
Warning: pause should be the first process
Has anyone tested this on an A100 80G?
Same here. The A100 80G is not detected by gpu-manager, but the A100 SXM 40G works fine.