k8s-device-plugin
Requesting zero GPUs allocates all GPUs
The README.md states:
WARNING: if you don't request GPUs when using the device plugin with NVIDIA images all the GPUs on the machine will be exposed inside your container.
I discovered a workaround for this, which is to set the environment variable NVIDIA_VISIBLE_DEVICES to none in the container spec.
With a resource request for nvidia.com/gpu: 0, this environment variable should be set automatically.
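In the meantime, a minimal sketch of the env-var workaround as a pod spec (the pod name and image are placeholders, not taken from this thread):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-only-workload               # placeholder name
spec:
  containers:
  - name: app
    image: nvidia/cuda:11.0-base        # placeholder CUDA-based image
    command: ["sleep", "infinity"]
    env:
    - name: NVIDIA_VISIBLE_DEVICES      # override the image's default (often "all") so no GPUs are exposed
      value: "none"
    # note: no nvidia.com/gpu resource request on this container
```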
Currently, the device plugin doesn't have the ability to inject env vars into pods.
However, you can implement this feature with a mutating admission webhook: just write a small web server that mutates the env var to the user's definition. I think it's not so difficult. (Actually, I did the same thing in our cluster.)
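For illustration only, a rough sketch of how such a webhook could be registered; every name, the namespace, and the Service below are hypothetical, and the small web server that actually performs the mutation (plus its TLS certificate) still has to be provided separately:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: nvidia-env-mutator                    # hypothetical name
webhooks:
- name: nvidia-env-mutator.example.com        # hypothetical webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                       # don't block pod creation if the webhook is unavailable
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:
      name: nvidia-env-mutator                # hypothetical Service fronting the webhook server
      namespace: kube-system
      path: /mutate
    # caBundle omitted; a real deployment needs the CA that signed the server certificate
```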
Does that mean that if I have two containers, both requesting nvidia.com/gpu: 0, they could share the GPU?
@yukuo78 basically yes, this is equivalent to the node-selector trick to share GPUs, as described in https://github.com/kubernetes/kubernetes/issues/52757#issuecomment-410419952.
Go check the follow-ups in that thread for more information.
@dhague it is a prerequisite that both nvidia.com/gpu: 0 and the NVIDIA_VISIBLE_DEVICES env var be set together, isn't it? Recently, if only nvidia.com/gpu: 0 is set, the related pod scheduled on a GPU node will probably crash with "OutOfnvidia.com/gpu"; its status typically looks like:
status:
  message: 'Pod Node didn''t have enough resource: nvidia.com/gpu, requested: 0, used: 1, capacity: 0'
  phase: Failed
  reason: OutOfnvidia.com/gpu
  startTime: "2019-05-09T03:05:49Z"
@everpeace could you share your custom admission webhook code, especially the part that mutates NVIDIA_VISIBLE_DEVICES?
I've tested this. If we add NVIDIA_VISIBLE_DEVICES=none to pod.spec.containers[*].env, then for a pod that wants to use 1 GPU via the Kubernetes container resource requests, the environment list at the time nvidia-container-runtime is executed will be (the order is important):
NVIDIA_VISIBLE_DEVICES=GPU-xxx-xxx-xxx-xxx-xxx
NVIDIA_VISIBLE_DEVICES=none
The nvidia-container-runtime may take the last one to decide which devices to mount, which results in no devices being available in the container, which is not expected. This behavior depends on the version of nvidia-container-runtime-hook (recently renamed to nvidia-container-toolkit) you use; please refer to this.
@Cas-pian did you mean two pods set up respectively with NVIDIA_VISIBLE_DEVICES=none and nvidia.com/gpu: 1 on the same node?
@Davidrjx no, I just found a bug in nvidia-container-runtime-hook (nvidia-container-toolkit): multiple NVIDIA_VISIBLE_DEVICES envs are not handled, which makes GPUs not get mounted as expected.
Step 1: I use a CUDA image that has the env NVIDIA_VISIBLE_DEVICES=all to start a pod (without setting resources.requests for a GPU); then all GPUs get mounted into the container. This makes k8s-device-plugin useless and breaks the environment of pods that do use resources.requests for GPUs.
Step 2: To fix the problem in step 1, I add NVIDIA_VISIBLE_DEVICES=none to pod.spec.containers[*].env to disable the default value of NVIDIA_VISIBLE_DEVICES in the image, but then no GPU is mounted into the pod at all, even if you use resources.requests to request a GPU!
And finally, I found it's not a good design to use the same mechanism (the NVIDIA_VISIBLE_DEVICES env) for both single-node GPU allocation and cluster-level GPU allocation, because CUDA images are made for single-node usage; it's better to use different mechanisms (e.g. different envs).
@flx42
@Cas-pian oh, now I understand what you mean.
I wrote a Kubernetes Mutating Admission Webhook called gpu-admission-webhook to handle this case. It sets NVIDIA_VISIBLE_DEVICES to "none" if you do not request a GPU. It also deletes environment variables that would cause issues or bypass this constraint.
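Not the actual gpu-admission-webhook code, but as a rough illustration, the mutation such a webhook returns can be expressed as a JSON Patch along these lines (shown here in YAML form; the container index is illustrative):

```yaml
# hypothetical patch for a container that has no nvidia.com/gpu request
- op: add
  path: /spec/containers/0/env/-    # append to the container's existing env list
  value:
    name: NVIDIA_VISIBLE_DEVICES
    value: "none"                   # hide all GPUs from this container
# a real webhook would also drop any pre-existing NVIDIA_VISIBLE_DEVICES entry
# that could bypass this constraint
```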
After reading the documentation about NVIDIA_VISIBLE_DEVICES, I advise you to set void instead of none. From the doc:
nvidia-container-runtime will have the same behavior as runc (i.e. neither GPUs nor capabilities are exposed)
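A sketch of the same container-spec workaround using void:

```yaml
env:
- name: NVIDIA_VISIBLE_DEVICES
  value: "void"   # per the doc quoted above: same behavior as plain runc, no GPUs or capabilities exposed
```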
I've tried to set:
resources:
  limits:
    nvidia.com/gpu: 0
My idea is to have multiple pods on the same node sharing a single GPU. But it looks like in that case, the app in the container does not utilise the GPU at all. What am I missing?
This is no longer an issue if you have the following lines in your /etc/nvidia-container-runtime/config.toml:
accept-nvidia-visible-devices-envvar-when-unprivileged = false
accept-nvidia-visible-devices-as-volume-mounts = true
and you deploy the nvidia-device-plugin with the values:
compatWithCPUManager: true
deviceListStrategy: volume-mounts
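For reference, a sketch of those options as a Helm values file for the standalone nvidia-device-plugin chart:

```yaml
# values.yaml for the nvidia-device-plugin Helm chart (sketch)
compatWithCPUManager: true          # as suggested above
deviceListStrategy: volume-mounts   # pass the device list as volume mounts instead of the NVIDIA_VISIBLE_DEVICES env var
```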
@ktarplee thanks for the clue!
Talking about /etc/nvidia-container-runtime/config.toml: I have a container built on top of tensorflow/tensorflow:1.14.0-gpu-py3 but see no config.toml. Where should it be edited?
It needs to be set on the host, not inside a container.
Here’s a link to the details: https://docs.google.com/document/d/1zy0key-EL6JH50MZgwg96RPYxxXXnVUdxLZwGiyqLd8/edit
@orkenstein the config file mentioned is installed on every host along with the NVIDIA Container Toolkit / NVIDIA Docker.
Thanks @klueska @elezar
I'm not sure how to do that on GCloud. Should I tweak the nvidia-installer somehow?
@orkenstein does that mean that you're not using the NVIDIA device plugin to enable GPU usage on GCloud, but something else instead? (Could you provide a link to the nvidia-installer you mention?)
@elezar the drivers get installed like this: https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers
@orkenstein GKE does not (currently) use the NVIDIA device plugin nor the NVIDIA container toolkit. Which means that the suggestion by @ktarplee is not applicable to you.
Ah, okay. What should I do then?
This is unfortunately not something that I can help with. You could try posting your request at https://github.com/GoogleCloudPlatform/container-engine-accelerators/issues (which contains the device plugin used on GKE systems).
> This is no longer an issue if you have the following lines in your /etc/nvidia-container-runtime/config.toml:
>
> accept-nvidia-visible-devices-envvar-when-unprivileged = false
> accept-nvidia-visible-devices-as-volume-mounts = true
>
> And you deploy the nvidia-device-plugin with the values:
>
> compatWithCPUManager: true
> deviceListStrategy: volume-mounts
Thanks for this solution. However, I'm deploying https://github.com/NVIDIA/gpu-operator to my k3s cluster with a docker backend, using gpu-operator to install the container runtime. Is it possible to inject this configuration into the helm deployment?
@sjdrc Currently it's not possible to set these parameters through the gpu-operator Helm deployment, as the toolkit container doesn't support configuring them yet. We will look into adding this support. Meanwhile, they need to be added manually to the /usr/local/nvidia/toolkit/.config/config.toml file, but the device-plugin settings can be configured through the --set devicePlugin.env[0].name=DEVICE_LIST_STRATEGY --set devicePlugin.env[0].value="volume-mounts" parameters during operator install. The compatWithCPUManager setting is already the default in the gpu-operator deployment.
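For readability, the same --set flags expressed as a values file for the gpu-operator chart:

```yaml
# gpu-operator values.yaml sketch, equivalent to the --set flags above
devicePlugin:
  env:
  - name: DEVICE_LIST_STRATEGY
    value: "volume-mounts"
```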
Thanks for your prompt reply.
So just to clarify, I should configure /usr/local/nvidia/toolkit/.config/config.toml on the host, and by setting volume-mounts, the device plugin will use the host configuration? I do not have that file, but I do have /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml.
Hey, I'm still having issues getting this working.
- Should the config changes go into /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml?
- This file is present on my host, but not /usr/local/nvidia/toolkit/.config/config.toml.
- What section in the config file do these changes go in? I have a top-level section, [nvidia-container-cli], and [nvidia-container-runtime].
- How can I make these persist? Every time I restart k3s the file content gets reverted.
Adding a bit more information about my setup process (from clean)
> Hey, I'm still having issues getting this working.
>
> - Should the config changes go into /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml?
> - This file is present on my host, but not /usr/local/nvidia/toolkit/.config/config.toml.
Sorry, /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml is the right location of this file.
> - What section in the config file do these changes go in? I have a top-level section, [nvidia-container-cli], and [nvidia-container-runtime].
You need to add those lines as global params.
disable-require = false
accept-nvidia-visible-devices-envvar-when-unprivileged = false
accept-nvidia-visible-devices-as-volume-mounts = true
[nvidia-container-cli]
environment = []
ldconfig = "@/run/nvidia/driver/sbin/ldconfig.real"
load-kmods = true
path = "/usr/local/nvidia/toolkit/nvidia-container-cli"
root = "/run/nvidia/driver"
[nvidia-container-runtime]
> - How can I make these persist? Every time I restart k3s the file content gets reverted.
I think this was because they were not added as global params.
I'm still running into issues.
Steps to reproduce
- Install Ubuntu Server 20.04
- Install Docker:
curl https://get.docker.com | sh \
&& sudo systemctl --now enable docker
- Blacklist nouveau
cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nvidia-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
- Disable AppArmor: sudo apt remove --assume-yes --purge apparmor
- Install k3s with the --docker flag
- helm install --version 1.9.0 --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator --set devicePlugin.env[0].name=DEVICE_LIST_STRATEGY --set devicePlugin.env[0].value="volume-mounts"
- Add globally to /usr/local/nvidia/toolkit/.config/nvidia-container-runtime/config.toml:
accept-nvidia-visible-devices-envvar-when-unprivileged = false
accept-nvidia-visible-devices-as-volume-mounts = true
- Reboot
Result
nvidia-device-plugin-validator is giving an error and refusing to start:
Error: failed to start container "plugin-validation": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli.real: device error: /var/run/nvidia-container-devices: unknown device: unknown