podlike
Using "devices" in a component or running the component as privileged
I'm aware that podlike is no longer under active development, but I decided to give it a shot to work around the limitation that Docker Swarm does not allow devices to be mounted on services. I thought I'd ask here about the issue I'm having, in case you ran into something similar during any of your tests.
In my scenario I'm trying to run Emby as a component so that it can mount /dev/vchiq, the VideoCore device on the Raspberry Pi 4, to allow for hardware encoding/decoding.
So, everything works fine for regular containers, like an nginx, or even an Emby container without devices. However, as soon as I add:
devices:
  - /dev/vchiq:/dev/vchiq
I start seeing the following error when attempting to create the container:
Using API version: 1.41
Starting component: emby
Exited: emby Error: Failed to start emby: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: process_linux.go:422: setting cgroup config for procHooks process caused: failed to write "a *:* rwm" to "/sys/fs/cgroup/devices/docker/3f1586043f59a13ba56ebdb46d9b569387019688cf465a5bd8a74111d223753c/86414a1dda9d7898ac5022151245b09edd84cb038c967002c3b21146aeddad50/devices.allow": write /sys/fs/cgroup/devices/docker/3f1586043f59a13ba56ebdb46d9b569387019688cf465a5bd8a74111d223753c/86414a1dda9d7898ac5022151245b09edd84cb038c967002c3b21146aeddad50/devices.allow: operation not permitted: unknown
Stopping container: emby
Failed to stop the container: Error response from daemon: No such container: 86414a1dda9d7898ac5022151245b09edd84cb038c967002c3b21146aeddad50
Failed to remove the container: Error: No such container: 86414a1dda9d7898ac5022151245b09edd84cb038c967002c3b21146aeddad50
The same occurs if instead of devices I attempt to set the component as privileged:
services:
  emby:
    image: ghcr.io/linuxserver/emby
    privileged: true
Now, the odd thing is that I have the Emby component's configuration inside a compose file, and if I manually start that exact same compose file using docker-compose up, everything works fine.
Did you encounter anything like this? I can't seem to find what differs between manually running docker-compose and what podlike does when it starts the container.
Hm, interesting, I don't think I've seen this before. I run a couple of things on RPi4 as well, but I don't think I use devices/privileged. If I had to guess, I'd say the API may have changed since I wrote podlike, so we might need to update those dependencies, and the API call parameters used when creating containers, if the way device settings are passed in has changed since.
Yup, indeed very odd. Now, it doesn't seem to be something specific to the component container or image. I can reproduce the same issue with the standard nginx image:
labels:
  pod.component.proxy: |
    image: nginx:1.13.10
    privileged: true
It might be some Raspbian craziness around device permissions, but that would be odd, since a regular docker-compose handles both "privileged" and "devices" without complaints. Real odd.
Alright, so I've been able to reproduce the issue and I believe I've found the culprit :). It seems to be related to the fact that podlike sets the CgroupParent property of the component container.
I reproduced it completely outside of podlike: from inside a different container with the docker socket mounted, I tried to start a container that applies the same CgroupParent value that podlike does.
That is, from that main container with the docker socket, I manually created and started another container (the component) with CgroupParent set to "/docker/<id-of-parent-container>", obtained from /proc/self/cgroup, just as podlike does.
Doing that causes the same failure I see within podlike.
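For anyone wanting to repeat this, here is a rough sketch of how the CgroupParent value could be read from /proc/self/cgroup. It assumes a cgroup v1 host, which the /sys/fs/cgroup/devices/... path in the error above suggests, where each line looks like "<hierarchy-id>:<controllers>:<path>". Picking the "devices" controller line is my own assumption and may not be the line podlike reads; the function name cgroup_parent_from is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup path of the "devices" controller.
# Input is cgroup v1 style: "<hierarchy-id>:<controllers>:<path>" per line.
# Reads the files given as arguments, or stdin when called without any.
# Prints nothing on a cgroup v2 host, where the only line is "0::<path>".
# Assumes the path itself contains no ":" characters.
cgroup_parent_from() {
    awk -F: '$2 ~ /(^|,)devices(,|$)/ { print $3; exit }' "$@"
}
```

Inside the parent container it would be called as: cgroup_parent_from /proc/self/cgroup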
It would appear that setting up the cgroup hierarchy that way prevents the child from being privileged or having access to host devices.
Component Container creation:
curl -XPOST --unix-socket /var/run/docker.sock -d '{"Image":"nginx:1.13.10", "HostConfig": { "Privileged": true, "AutoRemove": true, "CgroupParent": "/docker/<id-of-this-container>" }}' -H 'Content-Type: application/json' http://localhost/containers/create
Component Container start (which causes the failure):
curl -XPOST --unix-socket /var/run/docker.sock -H 'Content-Type: application/json' http://localhost/containers/<id-of-component-container>/start
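The two calls above can be chained so the component ID does not have to be copied by hand. This is only a sketch: the function names (create_payload, extract_id, create_and_start) are made up, and it assumes the create response is compact JSON containing an "Id" field, which is what the Docker Engine API returns for /containers/create.

```shell
#!/bin/sh
# Build the JSON body for POST /containers/create.
# $1 = image, $2 = cgroup parent, e.g. "/docker/<id-of-this-container>".
create_payload() {
    printf '{"Image":"%s","HostConfig":{"Privileged":true,"AutoRemove":true,"CgroupParent":"%s"}}\n' "$1" "$2"
}

# Pull the "Id" value out of the create response read from stdin.
extract_id() {
    sed -n 's/.*"Id": *"\([0-9a-f]*\)".*/\1/p'
}

# Create the component container and then start it.
# Needs curl and access to the docker socket; the start call is the one
# expected to fail when the cgroup parent is an unprivileged container.
create_and_start() {
    sock=/var/run/docker.sock
    id=$(create_payload "$1" "$2" |
        curl -s -XPOST --unix-socket "$sock" \
            -H 'Content-Type: application/json' \
            -d @- http://localhost/containers/create |
        extract_id)
    [ -n "$id" ] || return 1
    curl -s -XPOST --unix-socket "$sock" "http://localhost/containers/$id/start"
}
```

Example call: create_and_start nginx:1.13.10 "/docker/<id-of-this-container>"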
Ah, excellent findings! Have you tried giving the parent container access to the device, or making it privileged (if access to the device is not enough)?
Ah, well, that wouldn't work because the parent container would end up being a Swarm Service and privileged swarm services aren't allowed, just as devices aren't allowed.
Actually, I ended up implementing an alternative approach, based on the official docker-compose Docker image and some bash scripts, that lets me do roughly what podlike does by running docker-compose up and down during the service lifecycle. It also connects the "components" to the parent container's network for localhost sharing.
The main difference from podlike is that I'm not touching cgroups.
Seems to work nicely for now, and allows me to run privileged containers or have access to hardware devices. It's still very barebones and I'm just testing it locally for now. I'll probably share something here on GitHub if it ends up working in a stable manner (meaning stable cleanup of other containers, and that kind of thing).
It's basically running docker-compose side by side with swarm services and allowing them to interact indirectly through the parent container's ports. Something like that.
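To give an idea of the shape of such a wrapper, here is a minimal sketch of an entrypoint that brings the components up when the service task starts and tears them down when it is stopped. It is not the actual script described above: every variable name, default path and function name here is hypothetical, and the network sharing part is left out entirely.

```shell
#!/bin/sh
# All names and defaults below are hypothetical placeholders.
COMPOSE_BIN="${COMPOSE_BIN:-docker-compose}"
SIDECAR_PROJECT="${SIDECAR_PROJECT:-components}"
SIDECAR_COMPOSE_FILE="${SIDECAR_COMPOSE_FILE:-/components/docker-compose.yml}"

# Run docker-compose against the component project.
compose() {
    "$COMPOSE_BIN" -p "$SIDECAR_PROJECT" -f "$SIDECAR_COMPOSE_FILE" "$@"
}

# Tear the components down when the service task is stopped.
cleanup() {
    compose down
    exit 0
}

main() {
    trap cleanup TERM INT
    compose up -d || exit 1
    # Sleep in the background and wait on it, so a signal interrupts
    # the wait and the trap runs without waiting for sleep to finish.
    while :; do
        sleep 60 &
        wait $!
    done
}

# Only start when asked to, so the functions can be sourced on their own.
if [ "${SIDECAR_RUN:-0}" = "1" ]; then
    main
fi
```

Run as the service's entrypoint with SIDECAR_RUN=1 and the docker socket mounted. Swarm stops a task with SIGTERM by default, which is what triggers the docker-compose down here.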