Using CUDA MPS to enable GPU sharing in K8s, error: "error checking MPS daemon health"

Open zhangQiWorr opened this issue 1 year ago • 2 comments

[Three screenshots attached; images not preserved]

zhangQiWorr avatar Apr 11 '24 10:04 zhangQiWorr

Should I start the MPS daemon (nvidia-cuda-mps-control) on the k8s node myself?

zhangQiWorr avatar Apr 11 '24 10:04 zhangQiWorr

@zhangQiWorr are you deploying the device plugin using Helm? Using MPS with the device plugin requires both GFD and an additional component that manages the lifecycle of the MPS control daemon. Helm is the recommended way to deploy here, since the chart deploys the relevant daemonsets.

elezar avatar Apr 11 '24 11:04 elezar
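
For readers landing here with the same error, the following is a minimal sketch of the kind of sharing config the device plugin reads when MPS is enabled. It is an illustration, not the configuration from this issue: it assumes a device plugin release with MPS support (v0.15.0 or later, to my knowledge), and the replica count of 4 is an arbitrary example value. How the file is passed to the Helm chart, and how GFD is enabled alongside it, should be checked against the chart's README for the version in use.

```yaml
# Hypothetical example config; values are assumptions, not taken from this issue.
version: v1
sharing:
  mps:
    resources:
      # Advertise each physical GPU as 4 MPS-shared replicas (example value).
      - name: nvidia.com/gpu
        replicas: 4
```

With a Helm-based deployment, the MPS control daemon is started by its own daemonset, so it should not be necessary to launch nvidia-cuda-mps-control by hand on the node.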

This issue is stale because it has been open 90 days with no activity. This issue will be closed in 30 days unless new comments are made or the stale label is removed.

github-actions[bot] avatar Jul 11 '24 04:07 github-actions[bot]

This issue was automatically closed due to inactivity.

github-actions[bot] avatar Aug 11 '24 04:08 github-actions[bot]