Create `render` group for Ubuntu >= 20, as per ROCm documentation
Initial issue
As stated in https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation_new.html#setting-permissions-for-groups, for Ubuntu 20 and above, the user needs to be part of the render group.
Therefore, we need to create the render group in the docker image. The following would work:
RUN groupadd render
We might also want to update the documentation because the docker run command should contain --group-add render for Ubuntu 20 and above.
Update - 10th June 2022
I ran the following experiments. The user I'm logged in as on the host is part of the render group. My user ID is 1002.
- `docker run --rm --device=/dev/kfd rocm/dev-ubuntu-20.04:5.1 rocminfo` works because it runs as root (with user ID 0 on the host) and `ll /dev/kfd` shows `crw-rw---- 1 root render 510, 0 Jun 9 04:11 /dev/kfd`.
- `docker run --rm --user=1002 --device=/dev/kfd rocm/dev-ubuntu-20.04:5.1 rocminfo` will not work, failing with `Unable to open /dev/kfd read-write: Permission denied`.
- `docker run --rm --user=1002 --group-add render --device=/dev/kfd rocm/dev-ubuntu-20.04:5.1 rocminfo` will not work because inside of `rocm/dev-ubuntu-20.04:5.1` there is no render group.
- `docker run --rm --user=1002 --group-add $(getent group render | cut -d':' -f 3) --device=/dev/kfd rocm/dev-ubuntu-20.04:5.1 rocminfo` will work again.
Therefore, I see 2 ways of fixing this.
- Add a `render` group in the Docker image with ID 109 by default. This would be a "build time" fix and would break as soon as the host render group ID is not 109. The group ID could be passed as a build argument (`ARG`), but the image would not be portable (see the sketch just after this list):
FROM rocm/dev-ubuntu-20.04:5.1
RUN groupadd -g 109 render && useradd -g 109 -ms /bin/bash newuser
USER newuser
- The "run time" fix is to use `--group-add $(getent group render | cut -d':' -f 3)`.
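For illustration, here is a minimal sketch of the parameterized build-time variant (the RENDER_GID build argument and the my-rocm tag are placeholders introduced here, not part of the original report); it still bakes a single GID into the image, which is why it isn't portable:
FROM rocm/dev-ubuntu-20.04:5.1
# defaults to 109, but can be overridden at build time to match a particular host
ARG RENDER_GID=109
RUN groupadd -g ${RENDER_GID} render && useradd -g ${RENDER_GID} -ms /bin/bash newuser
USER newuser
and build it against the current host with:
docker build --build-arg RENDER_GID=$(getent group render | cut -d':' -f 3) -t my-rocm .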
Had a similar issue when I was building a Docker image with ROCm support.
The Problem
A non-root user can't access the GPU resources and has to run commands as sudo for GPU access.
Groups
A user inside the docker container has to be a member of the `video` and `render` groups to access the GPU without sudo:
- The `video` group exists by default on Debian systems and has the fixed id of `44`, so there's no need to do anything as long as the group on the host system and inside the container have the same name and id.
- The `render` group, on the other hand, is created by the `amdgpu-install` script on the host system and the id gets randomly assigned; for example, it can be one of the following: `104`, `109` or `110`.
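To see which id the render group ended up with on a particular host, you can check it with, for example:
getent group render
# e.g. render:x:109: (the number will differ from host to host)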
Solution
Use a Docker ENTRYPOINT to dynamically create the render group with the host system's render group id and assign the user to it.
Bash Script
Create an entrypoint.sh script and add it to the image during the build.
The script will create the render group with the host's group id and add the user to the video and render groups.
#!/bin/bash
sudo groupadd --gid $RENDER_GID render
sudo usermod -aG render $USERNAME
sudo usermod -aG video $USERNAME
exec "$@"
Dockerfile
Inside the Dockerfile we create a new user and copy the entrypoint.sh script to the image. A basic example:
FROM ubuntu
ENV USERNAME=rocm-user
ARG USER_UID=1000
ARG USER_GID=$USER_UID
# NOTE: the stock ubuntu base image does not include sudo, which entrypoint.sh relies on
RUN apt-get update && apt-get install -y sudo && rm -rf /var/lib/apt/lists/*
RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME
COPY entrypoint.sh /tmp
RUN chmod 777 /tmp/entrypoint.sh
USER $USERNAME
ENTRYPOINT ["/tmp/entrypoint.sh"]
CMD ["/bin/bash"]
docker build -t rocm-image .
Terminal
When starting the container, pass the RENDER_GID environment variable. Let's assume the Docker image is called rocm-image.
export RENDER_GID=$(getent group render | cut -d: -f3) && docker run -it --device=/dev/kfd --device=/dev/dri -e RENDER_GID --group-add $RENDER_GID rocm-image /bin/bash
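As a quick sanity check (assuming the image is based on a ROCm image such as rocm/dev-ubuntu-20.04 rather than plain ubuntu, so that rocminfo is available), the following should now work inside the container without sudo:
id              # the host's render GID should show up in the groups list
rocminfo | head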
VS Code Devcontainer
Just add the following to the .devcontainer/devcontainer.json file and you're good to go: a VS Code devcontainer with GPU access.
{
"build": { "dockerfile": "./Dockerfile" }
"overrideCommand": false,
"initializeCommand": "echo \"RENDER_GID=$(getent group render | cut -d: -f3)\" > .devcontainer/devcontainer.env",
"containerEnv": { "HSA_OVERRIDE_GFX_VERSION": "10.3.0" },
"runArgs": [
"--env-file=.devcontainer/devcontainer.env",
"--device=/dev/kfd",
"--device=/dev/dri"
]
}
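One note on why this stays portable (my reading of the devcontainer lifecycle, not something stated in the config itself): initializeCommand runs on the host before the container is created, so the env file is regenerated from the current host's render GID each time, e.g.:
cat .devcontainer/devcontainer.env
# RENDER_GID=109   (whatever getent reported on this particular host)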
On one of our machines, the GID of the render group on the host overlapped with the ssh group in the image, so the groupadd from the init script failed. It's best to use the group id (rather than the group name) in the following usermod as well, to still get an acceptable result in such a scenario.
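For reference, a minimal sketch of how entrypoint.sh could handle such a collision (assuming the same RENDER_GID and USERNAME variables as above; treat it as a starting point rather than a tested implementation):
#!/bin/bash
# create the render group only if no group already owns that GID (it may be taken by e.g. ssh)
if ! getent group "$RENDER_GID" > /dev/null; then
    sudo groupadd --gid "$RENDER_GID" render
fi
# add the user by GID rather than by name, so this also works when the GID belongs to another group
sudo usermod -aG "$RENDER_GID" "$USERNAME"
sudo usermod -aG video "$USERNAME"
exec "$@"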
Hi @romintomasetti @sergejcodes, thank you for both reporting this issue and providing a detailed solution to the problem. This has been addressed in our newer images by defaulting to a root user in order to maintain access to GPU resources. Please let me know if we can close out this issue.
@harkgill-amd in many cases, clusters (Kubernetes) have security policies that prevent containers from running as root, so this limitation will prevent MANY companies from being able to use AMD GPUs for their AI workloads.
In Kubernetes, this is likely something that your https://github.com/ROCm/k8s-device-plugin can resolve by checking the host's render group and adding it to supplementalGroups in the Pod's securityContext, but it's problematic if the cluster has multiple nodes which don't have the same GID for their render group.
However, I feel there must be a cleaner solution, because Nvidia GPUs have no such problems either with local Docker or with their Kubernetes device plugin. I would check what they are doing, but it might be something like having every device mount owned by a constant GID (e.g. 0, or something the user configures) and ensuring the Docker container runs as a user who has this group.
Here is the related issue on the AMD Device Plugin repo: https://github.com/ROCm/k8s-device-plugin/issues/39
Also, for context, when using Nvidia GPUs, you don't mount them with the --device parameter, but instead use the --gpus parameter, so perhaps this is part of their workaround:
- https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/docker-specialized.html
For reference, here is information about the --device arg of docker run; perhaps we need to explicitly allow read/write with the :rwm suffix (which is the default), or set something with --device-cgroup-rule:
Although, I guess the real question is why AMD ever thought it was a good idea to not have a static GID for the render group. Perhaps the solution is to deprecate the render group and always use video or make a new group.
Hi @harkgill-amd, I don't think this should be closed as the inherent problem with using a non-root user is still prevalent, and there isn't a clean solution for this.
@thesuperzapper and @gigabyte132, thank you for the feedback. We are currently exploring the possibility of using udev rules to access GPU resources in place of render groups. The steps would be the following:
- Create a new file `/etc/udev/rules.d/70-amdgpu.rules` with the following content:
KERNEL=="kfd", MODE="0666"
SUBSYSTEM=="drm", KERNEL=="renderD*", MODE="0666"
- Reload the `udev` rules with:
sudo udevadm control --reload-rules && sudo udevadm trigger
This configuration grants users read and write access to AMD GPU resources. From there, you can pass access to these devices into a container by specifying --device /dev/kfd --device /dev/dri in your docker run command. To restrict access to a subset of GPUs, please see the following documentation.
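As a quick check that the rules took effect (the exact set of render nodes will vary per system), the device nodes should now show world read/write:
ls -l /dev/kfd /dev/dri/renderD*
# expect a mode of crw-rw-rw- on these nodes once the rules above are applied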
I ran this setup with the rocm/rocm-terminal image and am able to access GPU resources without any render group mapping or root privileges. Could you please give this a try on your end and let me know what you think?
@harkgill-amd while changing the permissions on the host might work, I will note that this does not seem to be required for Nvidia GPUs.
I imagine that this is because they mount the device paths specifically: /dev/dri is not the path of the actual device, so docker's --device mount (which claims to give the container read/write permissions) does not correctly change its permissions.
Because specifying each device is obviously a pain for end users, they added a custom --gpus feature (also see these docs) which requires users to install the nvidia-container-toolkit.
I also want to highlight the differences between the Kubernetes device plugins for AMD and Nvidia, as this is where most people are using lots of GPUs, and the permission issues also occur with AMD but not Nvidia:
- Nvidia Device Plugin:
- https://github.com/NVIDIA/k8s-device-plugin/blob/v0.16.2/internal/plugin/server.go#L319-L333
- AMD Device Plugin:
- https://github.com/ROCm/k8s-device-plugin/blob/v1.25.2.8/cmd/k8s-device-plugin/main.go#L207-L240
@harkgill-amd after a lot of testing, it seems like the major container runtimes (including docker and containerd) don't actually change the permissions of devices mounted with --device like they claim to.
For example, you would expect the following command to mount /dev/dri/card1 with everybody having rw, but it does not:
docker run --device /dev/dri/card1 ubuntu ls -la /dev/dri
# OUTPUT:
# total 0
# drwxr-xr-x 2 root root 60 Oct 24 18:52 .
# drwxr-xr-x 6 root root 360 Oct 24 18:52 ..
# crw-rw---- 1 root 110 226, 1 Oct 24 18:52 card1
This also seemingly happens on Kubernetes, despite the AMD device plugin requesting that the container be given rw on the device.
@harkgill-amd We need to find a generic solution which allows a non-root container to be run on any server (with a default install of the AMD drivers).
This is problematic because there is no standard GID for the render group, and the container runtimes don't respect requests to change the permissions of mounted devices.
Note, it seems like Ubuntu has a default udev rule under /usr/lib/udev/rules.d/50-udev-default.rules which makes render the owner of /dev/dri/renderD* and video the owner of everything else in /dev/dri/.
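If you want to confirm that on a given host (the file path and exact contents can differ between releases), something like this shows the relevant lines:
grep -nE 'drm|render' /usr/lib/udev/rules.d/50-udev-default.rules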
Possible solutions
1. Give everyone read/write on `/dev/dri/renderD*` on the host (like you proposed above):
   - PROBLEM: Some users aren't going to want to make all their `/dev/dri/renderD*` devices have `0666` permissions.
2. Create a new standard GID to add as an owner of `/dev/dri/renderD*` (or use `video=44`).
3. Do what Nvidia does, and don't mount anything under `/dev/dri/` in the container, and instead mount something like the `/dev/nvidia0` devices, which have `crw-rw-rw-` and seemingly are how CUDA apps interact with the GPUs.
4. Mount the devices as bind volumes rather than as actual devices:
   - PROBLEM: would not work in Kubernetes, because the device plugin requires a list of device mounts to be returned for a container that requests an `amd.com/gpu: 1` limit, not volumes.
5. Automatically add the detected GID of the `render` group to the user as the container starts (because we don't know what the GID is before we start running on a specific server):
   - PROBLEM: this would require the non-root container user to be able to edit `/etc/group`, which would obviously allow root escalation.
6. Figure out why all the container runtimes are not respecting the request to change file permissions on device mounts.
@thesuperzapper thank you for outlining the possible solutions! Let me add some comments regarding some of them.
- Indeed, not all users would like to make their devices have 0666 permissions, but at the moment it is an official recommendation from AMD, see: ROCm docs
- The groups are managed by the OS, and changing group ownership or introducing a new group will not be persistent, so it is not recommended to set it on the host. Doing it in the running container, though, is ok.
- n/a
- n/a
- When the `docker run` command is available, using `--group-add` is a good option. When `docker run` is not available (e.g. in Kubernetes), it can be done using an `ENTRYPOINT` without granting the container user full root permissions, similar to what @sergejcodes suggested:
  - In the Dockerfile, grant the user permission to use certain utilities: `RUN chmod u+s /usr/bin/chgrp /usr/bin/chmod`
  - Call the entrypoint at the end: `ENTRYPOINT ["/tmp/entrypoint.sh"]`
  - Change group ownership in the entrypoint and revoke the permissions of the user:
chgrp video /dev/dri/renderD*
chmod u-s /usr/bin/chgrp /usr/bin/chmod
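Put together, such an entrypoint might look roughly like this (a sketch of the approach just described, with the caveats noted below):
#!/bin/bash
# chgrp/chmod were made setuid-root in the Dockerfile, so the non-root user can run them here
chgrp video /dev/dri/renderD*
# immediately drop the setuid bits again so they cannot be reused later in the session
chmod u-s /usr/bin/chgrp /usr/bin/chmod
exec "$@"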
This approach is also not recommended, because the entrypoint can be easily overridden and the user would then have root access to very powerful utilities such as chmod. I provide it only for completeness.
For Kubernetes, the better approach would be to update nodes as described above (point 1). Setting the group as supplementalGroups works, but setting a node-specific group id dynamically, in a reliable way and without a significant increase in complexity, is not possible at the moment.
- In my opinion the runtimes are not broken, at least not Docker. The Docker runtime sets cgroup permissions, not file permissions, on the devices. The resulting permissions are the "minimum" of the two. So if the cgroup permission is `rw` but the file permissions are `rw-rw----`, then the file permissions will still prevail because they are more restrictive.
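To make that concrete with a hypothetical example (the renderD128 path, GID 110 and UID 1002 below are placeholders; adjust them to your host): the device keeps its host file mode and group inside the container, so a non-root user only gets access once that GID is added as a supplementary group, e.g. via --group-add:
# cgroup access is granted by --device, but open() still has to pass the rw-rw---- file mode check
docker run --rm --user=1002 --device=/dev/dri/renderD128 ubuntu ls -l /dev/dri/renderD128
# adding the host's render GID as a supplementary group is what satisfies that check
docker run --rm --user=1002 --group-add 110 --device=/dev/dri/renderD128 ubuntu id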