docker exec as user not working with sysbox-runc on Ubuntu 23.04 (Lunar)
I've hit an issue where I'm unable to run docker exec --user for sysbox-runc containers on Ubuntu 23.04. For any user that doesn't exist in the image (but does exist in the container), I get the following error:
unable to find user [...]: no matching entries in passwd file
The same issue doesn't happen when using the default (runc) container runtime.
My best (semi-informed) guess is that it's somehow looking at the wrong filesystem layer and only seeing the /etc/passwd from the image.
I've written a script to reproduce the error but the manual steps are:
docker run --runtime=sysbox-runc --detach --rm --name "jammy" -t "ubuntu:jammy"
docker exec -ti "jammy" useradd execuser
docker exec -ti "jammy" cat /etc/passwd
docker exec --user "execuser" -ti "jammy" id
docker stop jammy
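The manual steps above can be wrapped in a small script so the same sequence can be run against both runtimes. This is only a sketch, not the reporter's original script: the function name repro_exec_user and the container name are hypothetical, and -ti is dropped from docker exec because a script may not have a TTY attached.

```shell
# Hypothetical helper: runs the manual steps against one runtime and reports
# whether `docker exec --user` could resolve the user created at runtime.
repro_exec_user() {
  local runtime="$1"              # e.g. "runc" or "sysbox-runc"
  local name="repro-${runtime}"   # hypothetical container name
  local result

  docker run --runtime="${runtime}" --detach --rm --name "${name}" -t "ubuntu:jammy" >/dev/null || return 2
  docker exec "${name}" useradd execuser

  # This is the step that fails with sysbox-runc in the report above.
  if docker exec --user "execuser" "${name}" id >/dev/null 2>&1; then
    result="ok"
  else
    result="FAILED"
  fi

  docker stop "${name}" >/dev/null
  echo "${runtime}: exec --user ${result}"
}
```

Usage would be something like `repro_exec_user runc; repro_exec_user sysbox-runc`, which prints one line per runtime.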
Environment
lsb_release:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 23.04
Release: 23.04
Codename: lunar
uname:
Linux lunar-amd64 6.2.0-20-generic #20-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr 6 07:48:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
docker:
Docker version 23.0.6, build ef23cbc
sysbox:
sysbox-mgr
edition: Community Edition (CE)
version: 0.6.1
commit: ba99c0e7088f1e1ab51f95551f50de9524176655
built at: Sat Apr 8 06:08:57 UTC 2023
built by: Rodny Molina
Can confirm this can still be reproduced with v0.6.5
Hi @saldrich-adx, thanks for reporting and very sorry that we dropped the ball on answering (you filed the original issue back in May 2023 and we never responded!).
I was easily able to reproduce with the steps you provided above, using Sysbox v0.6.5. There's definitely a bug somewhere, I'll investigate and get back.
I also confirmed that Docker with the default runc does NOT reproduce the problem.
I investigated and the error unable to find user [...]: no matching entries in passwd file is actually coming from Docker engine (rather than Sysbox).
During the docker exec --user "execuser" ..., Docker engine is looking for user execuser in the container's /etc/passwd file. However it does not find it. Yet the container can see it. Why?
The reason has to do with the way Sysbox sets up the container's rootfs when using idmapped mounts (supported in kernel 5.19+). When the container starts up, Docker engine typically sets up the container's rootfs using overlayfs. Sysbox performs ID-mapping of that rootfs by dispatching an agent that enters the container mount namespace, unmounts overlayfs, idmaps the lower layers, and remounts overlayfs. This re-mount of the rootfs is only seen within the container's mount-namespace, not by the host.
This explains why the container can add a new entry (e.g., execuser) to the /etc/passwd file and see it, yet Docker engine can't: the container sees a new overlayfs mount (in its mount namespace), and Docker sees the original overlayfs mount (in the host's mount-ns).
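One way to observe the two views side by side is to check the same user in the host-side overlayfs mount and inside the container. The sketch below rests on assumptions: it takes the host-side mount to be the MergedDir path that docker inspect reports for the overlay2 storage driver (reading it normally needs root), and the helper names has_passwd_entry and compare_passwd_views are hypothetical.

```shell
# Hypothetical helper: succeeds if passwd-format file $1 has an entry for user $2.
has_passwd_entry() {
  awk -F: -v u="$2" '$1 == u { found = 1 } END { exit !found }' "$1"
}

# Hypothetical helper: reports whether a user is visible from the host-side
# overlayfs mount and from inside the container's mount namespace.
compare_passwd_views() {
  local container="$1" user="$2" merged

  # Assumption: overlay2 storage driver, which exposes MergedDir.
  merged="$(docker inspect --format '{{ .GraphDriver.Data.MergedDir }}' "${container}")"

  if has_passwd_entry "${merged}/etc/passwd" "${user}"; then
    echo "host view: ${user} found"
  else
    echo "host view: ${user} missing"
  fi

  if docker exec "${container}" grep -q "^${user}:" /etc/passwd; then
    echo "container view: ${user} found"
  else
    echo "container view: ${user} missing"
  fi
}
```

If the explanation above holds, `compare_passwd_views jammy execuser` run as root would be expected to show the user in the container view only.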
I am not sure how to solve it, other than changing Docker engine to first enter the container's mount-namespace when looking up files inside the container.
Of course, an obvious workaround is to create the Docker image with all users predefined in /etc/passwd during the image build. However, ideally what @saldrich-adx was trying should work.
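A minimal sketch of that workaround, assuming a single extra user on top of ubuntu:jammy. The helper names and the image tag are hypothetical; the Dockerfile is piped to docker build on stdin, so no build context is needed.

```shell
# Prints a minimal Dockerfile that bakes the user into the image's /etc/passwd.
print_dockerfile() {
  local user="$1"
  printf 'FROM ubuntu:jammy\nRUN useradd %s\n' "${user}"
}

# Builds the image from the Dockerfile on stdin; the tag is a hypothetical name.
build_image_with_user() {
  local user="$1" tag="$2"
  print_dockerfile "${user}" | docker build -t "${tag}" -
}
```

For example, `build_image_with_user execuser jammy-execuser` followed by running the container from jammy-execuser instead of ubuntu:jammy, so that execuser is already present in the image layers.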
Hm... yes, that's an interesting one. The flag was added when docker exec (docker container exec) did not yet exist, so the lookup would happen during docker create (docker container create). In that situation, /etc/passwd would either exist in the image or, in some cases I know of, users mounted an /etc/passwd into the container; there would not have been a situation where a user was created at runtime. I went looking for what we documented, and it looks like we document "must be present in the container", but this may have been written from that assumption (no ambiguity between "image" and "container", as the container would not be mutated before the lookup happened): https://docs.docker.com/engine/containers/run/#user
Thinking about this, though, I recall a couple of things that may be related:
I recall that we have a code path where Docker looks up a user in /etc/passwd and falls back to getent; the fallback was for situations where alternative authentication methods are used:
https://github.com/moby/moby/blob/b1fc766e48a2f6dd053256efe481b583645b7da3/pkg/idtools/idtools_unix.go#L93-L101
However, I'd have to dig deep to check if that codepath is hit in all situations; when using the containerd-snapshotter, this lookup is delegated to containerd; https://github.com/moby/moby/blob/b1fc766e48a2f6dd053256efe481b583645b7da3/daemon/exec_linux.go#L45-L59
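As an illustration of the "file first, then getent" pattern described above, here is a shell sketch. It is not a port of the linked Go code; the function name lookup_user and its optional file argument are hypothetical.

```shell
# Hypothetical helper: prints the passwd entry for user $1.
# Checks the passwd-format file $2 (default /etc/passwd) first,
# then falls back to getent, which consults the configured passwd databases.
lookup_user() {
  local name="$1" file="${2:-/etc/passwd}" entry

  entry="$(awk -F: -v u="${name}" '$1 == u { print; exit }' "${file}" 2>/dev/null)"
  if [ -n "${entry}" ]; then
    echo "${entry}"
    return 0
  fi

  # Local file lookup failed; try getent if it is installed.
  command -v getent >/dev/null 2>&1 || return 1
  getent passwd "${name}"
}
```

Note that either step only helps if it runs against the filesystem view that actually contains the new user.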
The second thing I recall related to this was a discussion in runc about whether looking up the user is a responsibility of runc or of the runtime using runc. Originally, the lookup happened in runc (Docker would pass on username:groupname or uid:gid, and runc would look up the user if names were used instead of IDs). However, the OCI runtime spec omitted this part and only described UID/GID, later adding that it's the runtime's responsibility to resolve (https://github.com/opencontainers/runtime-spec/issues/38). Only accepting UID/GID would break compatibility with existing implementations, so the lookup was (for the time being) kept in runc, but other implementations did not implement this, so it's possible (but I'd really have to check) that we pre-resolve the UID/GID for this reason.
- https://github.com/opencontainers/runc/issues/3998
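Since the name lookup is the part that fails, one possible workaround is to resolve the name to numeric IDs inside the container and pass uid:gid to docker exec --user. This is a sketch only and has not been verified against Sysbox in this thread; the function name exec_as_runtime_user is hypothetical, and the resulting process may not be identical to a named login (for example, in supplementary groups).

```shell
# Hypothetical helper: runs a command as a user created at runtime by
# resolving the name inside the container and passing numeric IDs.
exec_as_runtime_user() {
  local container="$1" user="$2" uid gid
  shift 2

  # `id` runs inside the container, where the new passwd entry is visible.
  uid="$(docker exec "${container}" id -u "${user}")" || return 1
  gid="$(docker exec "${container}" id -g "${user}")" || return 1

  docker exec --user "${uid}:${gid}" "${container}" "$@"
}
```

For example: `exec_as_runtime_user jammy execuser id`.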
Thanks for investigating, sounds pretty gnarly. As an aside, I've been working around it by execing su to switch to the added user and running the command, e.g.:
docker exec -ti "jammy" su - execuser -c id
It would be great if exec did work directly though 🙂
Thanks! Yes, this may require some digging (through multiple codebases); it's one of the prime examples where things were relatively easy at the time the functionality was introduced 10 years ago ("just look in /etc/passwd and start the container" - what could ever get more complicated than that?), but things got much more involved over time.
> I've been working around it by execing su to switch to the added user and running the command
Depending on your case, it may also be worth looking into gosu. su (and sudo) and namespaces / containers aren't always friends, and may have some odd behaviours (sometimes very subtle, which can make it hard to discover); gosu is written using the same Go libraries as are used by Docker and runc to perform these tasks, and the README on the repository describes some of the "fun facts" related to su; https://github.com/tianon/gosu
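For comparison with the su workaround, a gosu-based variant could look like the sketch below. It assumes gosu can be installed from the distribution's package repositories inside the container; the helper names are hypothetical.

```shell
# Hypothetical helper: installs gosu inside a running Debian/Ubuntu container.
# Assumption: the package is available from the configured apt repositories.
install_gosu() {
  local container="$1"
  docker exec "${container}" sh -c 'apt-get update && apt-get install -y gosu'
}

# Hypothetical helper: runs a command as the given user via gosu.
# The exec itself runs as the container's default user; gosu switches users
# inside the container, so the lookup uses the container's own /etc/passwd.
exec_with_gosu() {
  local container="$1" user="$2"
  shift 2
  docker exec "${container}" gosu "${user}" "$@"
}
```

For example: `install_gosu jammy` once, then `exec_with_gosu jammy execuser id`.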