sysbox
[Bug] Unable to run container with --runtime=sysbox-runc when Docker data root is on an LVM
After installing Sysbox on Ubuntu Focal, either from the package release or from source, I cannot run a container with sysbox-runc. I always get the same error:
$ docker run --runtime=sysbox-runc hello-world
docker: Error response from daemon: OCI runtime create failed: container_linux.go:364: starting container process caused "process_linux.go:342: getting the final child's pid from pipe caused \"EOF\"": unknown.
ERRO[0000] error waiting for container: context canceled
The same command works with runc.
I tried this without success.
$ uname -a
Linux charles 5.4.0-58-generic #64-Ubuntu SMP Wed Dec 9 08:16:25 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ docker info
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.4.2-docker)
Server:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 206
Server Version: 20.10.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc sysbox-runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-58-generic
Operating System: Ubuntu 20.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 12
Total Memory: 31.01GiB
Name: charles
ID: LREV:LW7X:THPL:CJY4:W7V2:3PDY:OQ4R:EPPX:QA5F:3BTJ:JKTI:7C6W
Docker Root Dir: /home/clement/encrypted/system/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Default Address Pools:
Base: 172.25.0.0/16, Size: 24
WARNING: No swap limit support
WARNING: No blkio weight support
WARNING: No blkio weight_device support
$ sudo cat /etc/docker/daemon.json
{
"data-root": "/home/clement/encrypted/system/docker",
"bip": "172.20.0.1/16",
"runtimes": {
"sysbox-runc": {
"path": "/usr/local/sbin/sysbox-runc"
}
},
"default-address-pools": [
{
"base": "172.25.0.0/16",
"size": 24
}
]
}
Hi @cprevosteau, thanks for giving Sysbox a shot!
- Can you double check that the sysbox-mgr and sysbox-fs daemons are running? (e.g., ps -fu root | grep sysbox)
- Can you double check that the shiftfs module is present in the kernel? (lsmod | grep shiftfs)
For example, in my host these result in:
cesar@focal:~/nestybox/sysbox$ ps -fu root | grep sysbox
root 3137321 1 0 Dec16 pts/0 00:00:08 sysbox-mgr --log /var/log/sysbox-mgr.log
root 3137339 1 0 Dec16 pts/0 00:09:12 sysbox-fs --log /var/log/sysbox-fs.log
cesar@focal:~/nestybox/sysbox$ lsmod | grep shiftfs
shiftfs 28672 0
And thus docker run --runtime=sysbox-runc hello-world works without problem:
cesar@focal:~/nestybox/sysbox$ docker run --runtime=sysbox-runc hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
Thanks!
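The two checks above can also be wrapped in a small script. This is only a sketch: the function names (module_listed, sysbox_daemons_running, sysbox_preflight) are made up for illustration, and it assumes bash, awk, pgrep and lsmod are available on the host.

```shell
#!/usr/bin/env bash
# Sketch: the two pre-flight checks from this comment as reusable functions.
# All function names here are hypothetical, not part of Sysbox.

# Succeeds if the named module appears in lsmod-style text read from stdin.
module_listed() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Succeeds only if both Sysbox daemons have a running process.
sysbox_daemons_running() {
  pgrep -x sysbox-mgr >/dev/null && pgrep -x sysbox-fs >/dev/null
}

# Print one line per check.
sysbox_preflight() {
  if sysbox_daemons_running; then echo "daemons: ok"; else echo "daemons: missing"; fi
  if lsmod | module_listed shiftfs; then echo "shiftfs: ok"; else echo "shiftfs: missing"; fi
}
```

After sourcing the file, run sysbox_preflight. Note that both checks passing does not rule out other causes, as the rest of this thread shows.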
Hi, thanks for the quick answer! Unfortunately, nothing unusual there:
$ ps -fu root | grep sysbox
root 177555 1690 0 19:36 pts/5 00:00:00 sysbox-mgr --log /var/log/sysbox-mgr.log
root 177580 1690 0 19:36 pts/5 00:00:06 sysbox-fs --log /var/log/sysbox-fs.log
$ lsmod | grep shiftfs
shiftfs 28672 0
Would you mind joining the Sysbox Slack channel? It's easier to debug that way. We can post the resolution back into this GitHub issue once we find the problem.
The link to the Slack channel is at the bottom of this page: https://github.com/nestybox/sysbox#contact
Thanks!
After debugging this with @cprevosteau (thanks!), we found that the problem was the Docker data-root. It is typically at /var/lib/docker on ext4, but here it was configured to a different directory located on an LVM volume.
As far as I know there is nothing wrong with having the Docker data-root on an LVM volume, but for some reason (yet to be investigated) Sysbox fails when creating the container in this case.
More specifically, when the Docker data-root is on an LVM volume, the container's root filesystem is also on that volume, and this causes sysbox-runc to fail very early when creating the container's init process. Unfortunately, sysbox-runc does not provide much info on why the failure occurs. We only see getting the final child's pid from pipe caused "EOF", meaning that the container's init process died for some reason very soon after it was created.
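To check whether a given host is in the same situation, something like the following may help. It is a rough sketch, not a Sysbox tool: the function names (describe_backing_type, check_data_root) are made up, and it assumes findmnt and lsblk from util-linux are installed. It only inspects the block device that directly backs the mount, so a layered setup (for example an encrypted volume on top of LVM) would report the top layer's type rather than "lvm".

```shell
#!/usr/bin/env bash
# Sketch: report whether a directory sits directly on an LVM logical volume.
# Function names are hypothetical; findmnt and lsblk come from util-linux.

# Map an lsblk TYPE value (e.g. "lvm", "part", "crypt") to a short verdict.
describe_backing_type() {
  case "$1" in
    lvm) echo "on LVM" ;;
    "")  echo "unknown" ;;
    *)   echo "not on LVM ($1)" ;;
  esac
}

# Resolve the block device backing a path and print the verdict.
check_data_root() {
  local dir="$1" src type
  src="$(findmnt -n -o SOURCE --target "$dir" 2>/dev/null)"
  type="$(lsblk -n -d -o TYPE "$src" 2>/dev/null)"
  echo "$dir -> ${src:-?}: $(describe_backing_type "$type")"
}
```

After sourcing the file, run check_data_root "$(docker info --format '{{.DockerRootDir}}')" to test the data-root that Docker is actually using.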
Interestingly, we also noticed that Docker itself does not like the data-root on an LVM volume when configured with userns-remap. That is, if the Docker data-root is on an LVM volume and "userns-remap": "<some-user>" is added to the /etc/docker/daemon.json file, restarting Docker fails. If the data-root is moved to an ext4 physical partition, restarting Docker works without problem. Thus, there appears to be some incompatibility in Docker itself between userns-remap and LVM.