sysbox
Docker build inside sysbox container results in "lchown ... no such file or directory" errors
Hi there,
I am attempting to solve our CI/CD woes using sysbox and I was really excited to have it working, until it didn't.
Using a dotnet restore with dotnetcore image inside a docker build is failing with a very generic message:
Error processing tar file(exit status 1): lchown /tmp/clr-debug-pipe-202-24216845-in: no such file or directory
see here and here for more information
I am able to solve this by adding COMPlus_EnableDiagnostics=0 as an ENV in the Dockerfile or by passing it from docker-compose and using ARG in Dockerfile. However, I really don't want to have to alter a ton of Dockerfiles for a bunch of microservices, and I don't want to have to disable debugging, which is what that flag does.
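For reference, the ENV variant of the workaround described above might look roughly like the sketch below. It writes a throwaway Dockerfile into a temporary directory; the WORKDIR, COPY and RUN steps are illustrative assumptions rather than the actual project files, and the build command is left commented out because it needs Docker and a .NET project in the build context.

```shell
#!/bin/sh
# Sketch: generate a minimal Dockerfile that applies the workaround.
# Everything except the base image and the ENV line is assumed for illustration.
workdir="$(mktemp -d)"

cat > "$workdir/Dockerfile" <<'EOF'
FROM mcr.microsoft.com/dotnet/core/sdk:3.1-buster
# Workaround from this report; per the report, this disables debugging.
ENV COMPlus_EnableDiagnostics=0
WORKDIR /src
COPY . .
RUN dotnet restore
EOF

echo "Wrote $workdir/Dockerfile"
# To try it (requires Docker and a .NET project in the build context):
# docker build -t dotnet-restore-test -f "$workdir/Dockerfile" .
```

With ENV the setting also ends up in the resulting image; the ARG variant mentioned above would limit it to build time.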
How to reproduce: create a Dockerfile based on the mcr.microsoft.com/dotnet/core/sdk:3.1-buster image that runs dotnet restore (for example, by pulling a dotnet repo whose Dockerfile does a dotnet restore), then build it.
Things I have tried:
- running on normal Docker/non-dind = works as intended
- running on dind using the privileged flag, mounting /var/lib/docker as a volume and running nested = works
- running with sysbox as runtime and:
  - added cap_add - ALL to first docker-compose = fails
  - added cap_add - ALL to inner docker-compose = fails
I was able to strace the Docker daemon both when using standard Docker and when using dind with sysbox. Here are a few snippets:
standard:
mknodat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", S_IFIFO|0700) = 0
fchownat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
fchmodat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", 0700) = 0
utimensat(AT_FDCWD, "/tmp/clr-debug-pipe-78-63381236-in", [{tv_sec=1610522665, tv_nsec=0} /* 2021-01-13T07:24:25+0000 */, {tv_sec=1610522665, tv_nsec=0} /* 2021-01-13T07:24:25+0000 */], 0) = 0
sysbox:
newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2-init/merged/tmp/clr-debug-pipe-225-63286646-in", 0xc00192e6b8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2/merged/tmp/clr-debug-pipe-225-63286646-in", {st_mode=S_IFIFO|0700, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
lgetxattr("/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2/merged/tmp/clr-debug-pipe-225-63286646-in", "security.capability", 0xc00192a700, 128) = -1 ENODATA (No data available)
newfstatat(AT_FDCWD, "/var/lib/docker/overlay2/ca60dd45565e9b2b10754f95f7058ff401485ccca62253f28a522f105018c9b2-init/merged/tmp/clr-debug-pipe-225-63286646-out", 0xc00192e858, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
As you can see, under sysbox it doesn't seem to run any of the syscalls like mknodat, fchmodat, etc., which is why I was hoping that adding cap_add would solve this. Both containers are running as root.
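To make the "standard" trace easier to follow, here is a rough sketch that replays the same sequence with coreutils in a scratch directory. The file names are hypothetical, the ownership is set to the current user rather than to 0:0 so that it runs unprivileged, and the timestamp step (utimensat) is left out.

```shell
#!/bin/sh
# Sketch: replay the FIFO-creation sequence from the "standard" trace
# in a scratch directory instead of the real layer path.
scratch="$(mktemp -d)"
pipe="$scratch/clr-debug-pipe-demo-in"     # hypothetical name

mkfifo -m 700 "$pipe"                      # mknodat(..., S_IFIFO|0700)
chown -h "$(id -u):$(id -g)" "$pipe"       # fchownat(..., AT_SYMLINK_NOFOLLOW), i.e. lchown
chmod 700 "$pipe"                          # fchmodat(..., 0700)

# If the creation step never happens, the chown step fails with
# "No such file or directory", which matches the error in the build.
missing="$scratch/never-created"
chown -h "$(id -u):$(id -g)" "$missing" 2>&1 || true
```

The point of the second half is only that an lchown on a path that was never created produces the same class of error as the one reported by the build.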
Any help on this would be much appreciated!
Hi @DPatrickBoyd, thanks for giving Sysbox a shot and for filing the issue. Thanks for the initial debugging on it too!
Error processing tar file(exit status 1): lchown /tmp/clr-debug-pipe-202-24216845-in: no such file or directory
In the past I've seen this error when using an older container image with a new version of Docker. In fact, a couple of weeks ago someone in the Sysbox Slack channel reported this same issue and solved it by bumping the Docker version inside the container.
Inside the container, can you run the following?
lsb_release -a
uname -a
docker version
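If it helps, the three commands can be collected in one go with a small helper like the sketch below. The function name is made up for illustration; it simply skips any tool that is not installed in the container.

```shell
#!/bin/sh
# Sketch: gather the requested details, skipping tools that are absent.
collect_env_info() {
    for cmd in "lsb_release -a" "uname -a" "docker version"; do
        tool="${cmd%% *}"                  # first word of the command
        echo "== $cmd =="
        if command -v "$tool" >/dev/null 2>&1; then
            $cmd 2>&1 || echo "($cmd exited with an error)"
        else
            echo "($tool not installed)"
        fi
    done
}

collect_env_info
```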
Thanks!
Hi @ctalledo! Thanks for getting back to me so fast. Which container? The "outer" or the "inner" container?
This fails with my image using both Bionic and Focal
Currently:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
Docker version is latest
Client: Docker Engine - Community
 Version: 20.10.2
 API version: 1.41
 Go version: go1.13.15
 Git commit: 2291f61
 Built: Mon Dec 28 16:17:32 2020
 OS/Arch: linux/amd64
 Context: default
 Experimental: true
Server: Docker Engine - Community
 Engine:
  Version: 20.10.2
  API version: 1.41 (minimum version 1.12)
  Go version: go1.13.15
  Git commit: 8891c58
  Built: Mon Dec 28 16:15:09 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.4.3
  GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version: 1.0.0-rc92
  GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Hi @ctalledo! Thanks for getting back to me so fast. Which container? The "outer" or the "inner" container?
I meant the outer container (the one launched with Docker + Sysbox). I should have been more explicit since we can get confused quickly :)
If I understand correctly, the failure is occurring when doing a Docker build inside the outer container correct?
Ah, interesting: on inspecting the dotnetcore base image, it turns out it was created using Docker version 19. Not sure how that is possible, but it reports "DockerVersion": "19.03.13+azure".
edit: checking the other images in the cache, they seem to be using Docker 20.10, so it's apparently just the base image.
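For anyone wanting to repeat this check, here is a small sketch that pulls the DockerVersion field out of docker image inspect output. The helper name is made up; it reads JSON on stdin so it can be tried without a Docker daemon, and the sample line is the one quoted above.

```shell
#!/bin/sh
# Sketch: extract the DockerVersion field from `docker image inspect` JSON.
# Reads from stdin, so no daemon is needed to try it.
docker_version_from_inspect() {
    sed -n 's/.*"DockerVersion": *"\([^"]*\)".*/\1/p'
}

# Sample line taken from the report above.
echo '"DockerVersion": "19.03.13+azure",' | docker_version_from_inspect

# Against a real image (requires Docker):
# docker image inspect mcr.microsoft.com/dotnet/core/sdk:3.1-buster | docker_version_from_inspect
```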
If I understand correctly, the failure is occurring when doing a Docker build inside the outer container correct?
yes correct. Both my host vm and the sysbox container are all running 20.10
@DPatrickBoyd:
To help me reproduce:
-
What command are you using to launch the outer container (Docker + Sysbox).
-
Can you share the Dockerfile for the inner container you are trying to build?
Thanks!
So I am actually using a docker-compose file for this, and it's a little convoluted. Since it involves company code, I will try to simplify it and get back to you soon. Thanks.
Sure; whatever you can share that allows me to repro would be great. Thanks!
Hi @DPatrickBoyd,
I tried reproducing with a sysbox container using nestybox/ubuntu-bionic-systemd-docker, with the inner Docker at version 19.03 or version 20.10, and was not able to reproduce. That is, inside that sysbox container I could easily build a Dockerfile that looks like this:
FROM mcr.microsoft.com/dotnet/core/sdk:3.1-buster
RUN apt-get update && apt-get install -y nano
I also tried reproducing with a sysbox container using nestybox/ubuntu-focal-systemd-docker, with the inner Docker at version 19.03, and no problem there either.
I suspect the Dockerfile I used is too simple, so if you could provide more info on the Dockerfile that is causing the failure it would be useful.
OK, I was able to replicate this with publicly available files. Here is a random dotnet application I found: https://github.com/dotnet-architecture/eShopOnWeb.git
sysbox was brought up with:
sudo docker run --runtime=sysbox-runc --rm -it --hostname my_cont ubuntu:latest bash
I then installed Docker manually and started the Docker daemon with system docker start
I then used
sudo docker exec -it $containerid bash
to get inside of it
- then just run inside of sysbox:
git clone https://github.com/dotnet-architecture/eShopOnWeb.git
cd eShopOnWeb
docker-compose build
Thanks ... let me give it a shot and get back to you in a bit.
Hi @DPatrickBoyd:
I followed the steps but was not able to repro. Here is what I did:
- Launched the sysbox container:
docker run --runtime=sysbox-runc --rm -it --hostname my_cont ubuntu:latest bash
The remainder of the commands occur inside the container.
- Verified the container has Ubuntu Focal in it:
# cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
- Installed and started Docker:
# apt-get update && apt install docker.io
# dockerd > /var/log/dockerd.log 2>&1 &
NOTE: this installed docker v19.03.8.
- Installed Docker compose:
# apt-get install curl
# curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
# chmod +x /usr/local/bin/docker-compose
# ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
NOTE: this installed docker-compose 1.27.4
- Cloned the eShopOnWeb repo and built it:
# git clone https://github.com/dotnet-architecture/eShopOnWeb.git
# cd eShopOnWeb/
# docker-compose build
This worked without a problem.
I suspect the Docker versions I used are different than what you used.
Can you confirm the versions of Docker and Docker-compose you had inside the container?
My versions are all 20+ for docker. Doubt docker-compose is a deal breaker. I will downgrade to docker v19 and report back. Can you try with version 20 and see if it fails for you?
Sounds good, let's do that. Thanks!
ok, so reporting back. I was able to get it to work by downgrading, but I had to manually download and install docker-ce, docker-cli and containerd.io .deb files in the dockerfile because using docker.io was breaking init.d.
I was able to successfully build using sysbox, so thank you!
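A quick guard along the following lines could flag the affected setup before a build starts. This is only a sketch: the threshold of 20 comes from what was observed in this thread (20.10.2 failed, 19.03.8 worked), not from Docker documentation, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: flag inner Docker versions in the range that failed in this report.
docker_major() {
    # "20.10.2" -> "20"
    echo "${1%%.*}"
}

needs_downgrade_workaround() {
    [ "$(docker_major "$1")" -ge 20 ]
}

for v in 19.03.8 20.10.2; do
    if needs_downgrade_workaround "$v"; then
        echo "$v: affected in this report"
    else
        echo "$v: built successfully in this report"
    fi
done

# With a live daemon (requires Docker):
# needs_downgrade_workaround "$(docker version --format '{{.Server.Version}}')"
```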
My only concern at the moment is how changing runtimes from runc to sysbox could potentially affect the downstream (i.e., production) environment. How different is sysbox, and what sort of things could it affect?
Does it alter the resulting image or information in any way that is unique or proprietary?
Hi @DPatrickBoyd,
I was able to get it to work by downgrading
Got it. I still want to get to the bottom of why the downgrade is needed, so will take a closer look.
but I had to manually download and install docker-ce, docker-cli and containerd.io .deb files in the dockerfile because using docker.io was breaking init.d.
I see; that's strange. In my case the outer container had systemd in it, and installing docker.io worked perfectly.
My only concern at the moment is how changing runtimes from runc to sysbox could potentially affect the downstream (i.e., production) environment. How different is sysbox, and what sort of things could it affect?
Sysbox can live side-by-side with the OCI runc, so it's not an "either" choice.
As you are just getting familiarized with Sysbox, my suggestion is that you use Sysbox for containers that run workloads that otherwise require privileged containers with the OCI runc. Things like Docker-in-Docker, systemd-in-docker, or even k8s-in-Docker. This way you avoid the security risks posed by privileged containers. It's great for CI/CD, container-based dev environments, sandboxing, etc.
Having said this, we strive to make Sysbox a superset of the OCI runc, meaning that Sysbox should be capable of running any workloads that run in containers with the OCI runc, but do so more securely. This is the case already for most workloads, though there are still a few issues.
Does it alter the resulting image or information in any way that is unique or proprietary?
No. Sysbox places no requirements on the container image. Rather, it works by enhancing the container abstraction, such that processes running inside the container see an environment that resembles that of a VM or physical machine (though it's really a container).
Hope that helps!
yes it does help thank you :)
One thing that did happen was that at some point my containers got restarted, and the Docker daemon inside of the sysbox container couldn't start up again: it still had a containerd process running somehow, and the .pid file was still there for Docker. Not sure if there is something I can do, or if there is a better way to handle containers that were shut down ungracefully? I can make a new issue for this or we can take it to Slack.
One thing that did happen was that at some point my containers got restarted, and the docker daemon inside of the sysbox container couldn't start up again, it still had a containerd process running somehow and the .pid file was still there for docker
A good way to deal with that is to add a process manager to the container (e.g., systemd, supervisord, etc), such that when the container starts it can automatically start the processes / services you want it to. It also has the advantage of handling process reaping / reparenting for your container.
Systemd is a bit heavy but Docker integrates well with it, so it will be able to restart the Docker service and automatically remove the .pid file.
You can find examples of Dockerfiles that add systemd or supervisord to the container here:
https://github.com/nestybox/dockerfiles
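Where a full process manager is not an option, a lighter variant of the same idea is a check in the container's entrypoint that clears a stale pid file before starting the daemon. The sketch below is an assumption on my part rather than something taken from the linked Dockerfiles: the function name is hypothetical, /var/run/docker.pid is Docker's default pid file location, and it does not deal with a leftover containerd process.

```shell
#!/bin/sh
# Sketch: remove a pid file left behind by an unclean shutdown.
# Returns 0 if the file is absent or was removed, 1 if its process is alive.
clean_stale_pidfile() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 0
    pid="$(cat "$pidfile")"
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 1          # process still running; leave the file alone
    fi
    rm -f "$pidfile"
}

# Intended use in a hypothetical entrypoint script (requires Docker):
# clean_stale_pidfile /var/run/docker.pid && dockerd > /var/log/dockerd.log 2>&1 &
```

Note that kill -0 only reports a process as alive if the caller may signal it, so this assumes the entrypoint runs as root inside the container.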
using this process gives me the following error:
Couldn't find an alternative telinit implementation to spawn.
on further inspection, I am using the init flag for starting containers, which uses docker-init (tini) for PID 1, and I believe that is interfering with systemd starting up as it wants to be PID 1
Got it, thanks.
We spotted a lchown error in our gitlab-ci infrastructure too which happens when pulling an oracle image within a docker dind container:
[ERROR] DOCKER> Unable to pull '[...]oracle-xe-11-2-0-2:RELEASE-1.0.1' from registry '[...]' : failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory [failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory ]
This happens while pulling an image built from this Dockerfile and this RPM archive:
https://github.com/oracle/docker-images/tree/main/OracleDatabase/SingleInstance/dockerfiles/11.2.0.2
http://download.xskernel.org/soft/linux-rpm/oracle-xe-11.2.0-1.0.x86_64.rpm.zip
It only happens within a docker dind container running on a host docker-daemon using sysbox-runc. When running docker dind on runc (with --privileged) it works.
This does NOT happen on the host docker-daemon using sysbox-runc so it is not a general sysbox problem.
Hi @nudgegoonies , thanks for the latest report.
Question: what's the version / tag of the docker:dind container image on which this happens? Does it happen with the latest docker:dind?
I ask because in the past we've seen problems with the docker:18.04-dind image, but these don't repro with the 19.04 image.
Thanks!
Hi @ctalledo Thank you for your answer!
We use version 20.10.2 as host docker daemon and as dind.
Got it; could you provide the repro steps please?
You have to build a docker image with the Dockerfile.ex and the .zip file linked in my above comment and store it in a registry. Then start a dind with volume:
docker volume create --name docker-dind
docker pull docker:20.10.2-dind
docker run --name docker-dind -v docker-dind:/var/lib/docker -d docker:20.10.2-dind
Then exec into the docker-dind container and pull the self-built oracle image.
Thanks @nudgegoonies ; I was able to repro following your repro steps, will debug it.
Hi @nudgegoonies, had to dig a bit to get to the bottom of this one, but I think I've found the reason for the problem.
First, I reproduced the problem by launching a sysbox container (with the nestybox/ubuntu-focal-systemd-docker image), and inside of it launching the docker CLI and docker daemon containers as follows:
$ docker network create some-network
$ docker volume create --name docker-dind
$ docker pull docker:20.10.2-dind
# Inner Docker dind container:
$ docker run --privileged --name dind -d -v docker-dind:/var/lib/docker --network some-network --network-alias docker -e DOCKER_TLS_CERTDIR=/certs -v dind-certs-ca:/certs/ca -v dind-certs-client:/certs/client docker:20.10.2-dind
# Inner Docker CLI container:
$ docker run -it --rm --network some-network -e DOCKER_TLS_CERTDIR=/certs -v dind-certs-client:/certs/client:ro docker:latest sh
Then, from the inner Docker CLI container, I pulled the oracle database container image you mentioned above.
The pull failed with:
failed to register layer: Error processing tar file(exit status 1): lchown /dev/initctl: no such file or directory
I then straced the docker pull operation and found that the failure occurs in the fchownat() syscall below:
2407193 newfstatat(AT_FDCWD, "/dev/initctl", 0xc000a6eac8, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
2407193 fchownat(AT_FDCWD, "/dev/initctl", 0, 0, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
2407193 write(2, "lchown /dev/initctl: no such fil"..., 46) = 46
Basically, it looks like this image requires /dev/initctl; as a result, Docker is looking for /dev/initctl during the image extraction but this device does not exist within the ephemeral docker container (spawned inside the dind container) where the extraction is taking place.
I then repeated the experiment by running the same commands above, but this time at host level (i.e., not inside the sysbox container). Interestingly, this time things worked. I straced the Docker daemon and found the following:
2496154 newfstatat(AT_FDCWD, "/dev/initctl", <unfinished ...>
2496154 <... newfstatat resumed>0xc000a3cc68, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
2496154 mknodat(AT_FDCWD, "/dev/initctl", S_IFIFO|0600 <unfinished ...>
2496154 <... mknodat resumed>) = 0
2496154 fchownat(AT_FDCWD, "/dev/initctl", 0, 0, AT_SYMLINK_NOFOLLOW <unfinished ...>
2496154 <... fchownat resumed>) = 0
Notice the difference: docker called mknod on /dev/initctl during the image extraction. As a result, the subsequent fchownat() worked fine.
So why did Docker not call mknod when the dind image ran inside the sysbox container, but did call it when the dind image ran on the host?
Looking at the Docker code, it appears the answer is here:
case mode&os.ModeDevice != 0:
	if sys.RunningInUserNS() {
		// cannot create a device if running in user namespace
		return nil
	}
	if err := unix.Mknod(dstPath, stat.Mode, int(stat.Rdev)); err != nil {
		return err
	}
Since Sysbox containers always use the Linux user-namespace (for strong isolation), the Docker daemon running inside the inner dind container is refusing to use mknod to create the /dev/initctl device required by the Oracle image. As a result, the subsequent fchownat() fails.
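To see which branch a given environment would take, one can look at /proc/self/uid_map. The sketch below is an illustration under an assumption: it uses the common heuristic that the initial user namespace maps the full ID range onto itself ("0 0 4294967295"), which may not be exactly what Docker's helper does internally. The function name is made up.

```shell
#!/bin/sh
# Sketch: decide from uid_map content whether a process is in a user namespace.
# Assumed heuristic: only the initial namespace has the identity mapping
# "0 0 4294967295"; any other mapping implies a user namespace.
in_userns_from_map() {
    # $1: contents of a uid_map file
    set -- $1            # split into fields: inside-id outside-id length
    [ $# -ge 3 ] || return 1
    if [ "$1" = 0 ] && [ "$2" = 0 ] && [ "$3" = 4294967295 ]; then
        return 1         # identity mapping: not in a user namespace
    fi
    return 0
}

if in_userns_from_map "$(cat /proc/self/uid_map 2>/dev/null)"; then
    echo "user namespace detected: a check like the one above would skip mknod"
else
    echo "no user namespace detected"
fi
```

Run on a regular host this would typically report no user namespace, while inside a container that uses the user-namespace it should report one.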
This explains the failure. It is really caused by Docker's assumption that mknod is not allowed within a user namespace. This is generally true, but it does not take into account that container runtimes like Sysbox (or LXD, for example) can handle such operations properly by intercepting the mknod syscall, checking whether it is allowed, and if so performing it on behalf of the container. It would therefore be better if Docker called mknod() first and, only if that failed, checked whether it is running in a user namespace.
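The ordering suggested here ("try first, check afterwards") can be sketched in shell as follows. This is not Docker's code: the function names are hypothetical, a FIFO is used because /dev/initctl is a FIFO in the traces above and creating one needs no special privileges, and the user-namespace check relies on the assumed /proc/self/uid_map identity-mapping heuristic.

```shell
#!/bin/sh
# Sketch: attempt the creation first and only tolerate a failure
# when running inside a user namespace.
in_userns() {
    # Assumed heuristic: anything but "0 0 4294967295" means user namespace.
    [ -r /proc/self/uid_map ] || return 1
    read -r inside outside count < /proc/self/uid_map || return 1
    ! { [ "$inside" = 0 ] && [ "$outside" = 0 ] && [ "$count" = 4294967295 ]; }
}

create_fifo_or_skip() {
    path="$1"
    if mkfifo "$path" 2>/dev/null; then
        return 0                      # the runtime allowed it: done
    fi
    if in_userns; then
        echo "skipping $path: creation failed inside a user namespace" >&2
        return 0                      # tolerate the failure only here
    fi
    echo "failed to create $path" >&2
    return 1
}
```

With this ordering, a runtime that handles the call on behalf of the container gets the node created, and the skip path is only taken when the creation genuinely fails.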
As for a solution, I don't have a good one right now. The only workaround I found was to use the docker:19.03.2-dind image instead of the docker:20.10.2-dind image (which suggests that the userns check in the Docker source code I copied above must have been added recently).
I'll think if there is some other solution to make this work with docker:20.10.2-dind.
Hi @ctalledo Thank you very much for your detailed explanation! Now this makes sense.
One question comes to mind, since this behavior comes from using userns: would shiftfs help in this situation? There are already "unofficial" DKMS solutions available for running the shiftfs kernel module on Debian.