
Docker example does not work

Open the-maldridge opened this issue 4 years ago • 10 comments

Description

The docker example appears broken. The config.json doesn't appear to be included in the build, among other issues.

Steps to reproduce the issue:

  1. Build linuxkit from master. I built from 1df038e1b.
  2. Copy the docker.yml example to a new empty directory.
  3. Build:
./linuxkit build docker.yml 
Extract kernel image: docker.io/linuxkit/kernel:5.10.76
Add init containers:
Process init image: docker.io/linuxkit/init:7e3d51e6ab5896ecb36a4829450f7430f2878927
Image docker.io/linuxkit/init:7e3d51e6ab5896ecb36a4829450f7430f2878927 not found in local cache, pulling
Process init image: docker.io/linuxkit/runc:9f7aad4eb5e4360cc9ed8778a5c501cce6e21601
Process init image: docker.io/linuxkit/containerd:2f0907913dd54ab5186006034eb224a0da12443e
Process init image: docker.io/linuxkit/ca-certificates:c1c73ef590dffb6a0138cf758fe4a4305c9864f4
Add onboot containers:
  Create OCI config for linuxkit/sysctl:bdc99eeedc224439ff237990ee06e5b992c8c1ae
Image docker.io/linuxkit/sysctl:bdc99eeedc224439ff237990ee06e5b992c8c1ae not found in local cache, pulling
  Create OCI config for linuxkit/sysfs:c3bdb00c5e23bf566d294bafd5f7890ca319056f
Image docker.io/linuxkit/sysfs:c3bdb00c5e23bf566d294bafd5f7890ca319056f not found in local cache, pulling
  Create OCI config for linuxkit/format:7efa07559dd23cb4dbebfd3ab48c50fd33625918
Image docker.io/linuxkit/format:7efa07559dd23cb4dbebfd3ab48c50fd33625918 not found in local cache, pulling
  Create OCI config for linuxkit/mount:422b219bb1c7051096126ac83e6dcc8b2f3f1176
Image docker.io/linuxkit/mount:422b219bb1c7051096126ac83e6dcc8b2f3f1176 not found in local cache, pulling
Add service containers:
  Create OCI config for linuxkit/getty:3c6e89681a988c3d4e2610fcd7aaaa0247ded3ec
  Create OCI config for linuxkit/rngd:4f85d8de3f6f45973a8c88dc8fba9ec596e5495a
Image docker.io/linuxkit/rngd:4f85d8de3f6f45973a8c88dc8fba9ec596e5495a not found in local cache, pulling
  Create OCI config for linuxkit/dhcpcd:52d2c4df0311b182e99241cdc382ff726755c450
  Create OCI config for linuxkit/openntpd:d6c36ac367ed26a6eeffd8db78334d9f8041b038
  Create OCI config for docker:20.10.6-dind
Add files:
  var/lib/docker
  etc/docker/daemon.json
Create outputs:
  docker-kernel docker-initrd.img docker-cmdline
  4. Try and run it.
  5. Observe that docker isn't running:
(ns: getty) linuxkit-16c44dd030d2:~# nsenter -m -t 1
linuxkit-16c44dd030d2:/# ctr -n services.linuxkit task list
TASK      PID    STATUS    
getty     450    RUNNING
ntpd      543    RUNNING
rngd      594    STOPPED
dhcpcd    391    RUNNING

It's not running because the daemon.json file from the example config didn't get created:

linuxkit-16c44dd030d2:/# stat /etc/docker/daemon.json
stat: can't stat '/etc/docker/daemon.json': No such file or directory

In fact most of the container isn't there:

linuxkit-16c44dd030d2:/# ls /containers/services/docker/
lower
  6. Debug in sadness that things are breaking in very hard-to-troubleshoot ways.

Describe the results you received: I got a broken VM that didn't include all the stuff that should have been in there.

Describe the results you expected: I expected the example to work and to give me a VM with a working dockerd in it that I could then build further applications on top of.

Additional information you deem important (e.g. issue happens only occasionally): I maintain a Terraform provider that serves as an alternate frontend to linuxkit. Right now it's pinned at 5b7466732a90, which has the same problems when using linuxkit directly, but the Terraform frontend does actually put the whole filesystem together. Dockerd, however, does not work in the resulting image. I think this ticket should stay scoped to the example being broken, but it's also important to note that even if the build worked, it still doesn't have a working dockerd in it.

the-maldridge avatar Jan 15 '22 05:01 the-maldridge

I'm still not sure why this fails to assemble with pure linuxkit, but I did ultimately track the issues with dockerd back to runc. Rolling back to linuxkit/runc:v0.8 fixes the issue, but I suspect this is not side-effect free.

the-maldridge avatar Jan 28 '22 07:01 the-maldridge

I tried rolling back to runc:v0.8 and it didn't resolve the issue for me. Could there be something else you tried that solved it?

I noticed that:

  • in debug mode the linuxkit build does appear to emit the relevant config.json but doesn't write the files to the image
  • none of the files defined in the files section are being put in the image

I also tried clearing the linuxkit cache and re-building but that didn't help.

jinnko avatar Feb 01 '22 21:02 jinnko

@jinnko my apologies for not being clear above. I cannot get the example to build on any version with LinuxKit directly. Building the Terraform provider with current linuxkit master, I can both build the image and make it work with v0.8 of runc.

the-maldridge avatar Feb 01 '22 21:02 the-maldridge

I am not convinced that is the issue. I did a build, and looked at the generated output files, specifically docker-initrd.img, which is just a gzipped cpio file.

I gunzipped it and extracted it, and I do see everything:

cpio -it < docker-initrd.cpio
...
...
containers/services/docker/lower/etc
containers/services/docker/lower/etc/hosts
containers/services/docker/lower/etc/resolv.conf
containers/services/docker/lower/proc
containers/services/docker/lower/sys
containers/services/docker/config.json
containers/services/docker/tmp
containers/services/docker/rootfs
containers/services/docker/runtime.json
var
var/lib
var/lib/docker
etc
etc/docker
etc/docker/daemon.json

This leads me to believe it is a runtime issue. How are you running it?
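
For anyone who wants to repeat this check in scripted form, here is a minimal sketch. It assumes the listing format shown above (one path per line, no leading ./). The helper name missing_paths is hypothetical and not part of linuxkit.

```shell
# Hypothetical helper: read an archive listing (one path per line) on stdin
# and print every expected path, given as an argument, that is not in it.
# Returns non-zero if at least one expected path is missing.
missing_paths() {
    listing=$(cat)
    status=0
    for path in "$@"; do
        # -x matches whole lines only, -F treats the path as a literal string
        if ! printf '%s\n' "$listing" | grep -xF -- "$path" >/dev/null; then
            printf 'missing: %s\n' "$path"
            status=1
        fi
    done
    return "$status"
}
```

It could be fed directly from the image, for example: gunzip -c < docker-initrd.img | cpio -it | missing_paths etc/docker/daemon.json containers/services/docker/config.json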

deitch avatar Feb 02 '22 09:02 deitch

I tried rerunning it, allocating 2048MB of memory instead of the default 1024. Works fine.

deitch avatar Feb 02 '22 09:02 deitch

When you run it as kernel+initrd, it all gets loaded into memory. If there isn't enough provided, it will fail.
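
A rough pre-flight check along these lines might catch the problem before booting. This is only a sketch: the helper name mem_headroom_ok is hypothetical, and the default factor of 2 is an arbitrary safety margin I picked, not a documented LinuxKit requirement.

```shell
# Hypothetical helper: succeed if the VM memory (in MB) is at least `factor`
# times the unpacked initrd size (in bytes). The default factor of 2 is an
# assumed safety margin, not a LinuxKit rule.
mem_headroom_ok() {
    unpacked_bytes=$1
    mem_mb=$2
    factor=${3:-2}
    # Round the required size up to the next whole MB
    needed_mb=$(( (unpacked_bytes * factor + 1048575) / 1048576 ))
    if [ "$mem_mb" -lt "$needed_mb" ]; then
        printf 'initrd needs about %s MB with headroom, VM has %s MB\n' \
            "$needed_mb" "$mem_mb" >&2
        return 1
    fi
}
```

The unpacked size can be measured without writing anything to disk: mem_headroom_ok "$(gunzip -c < docker-initrd.img | wc -c)" 2048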

deitch avatar Feb 02 '22 09:02 deitch

I am unable to duplicate your results. I build the linuxkit binary using the makefile and then run linuxkit build docker.yml. After doing this I run linuxkit run qemu -kernel docker -mem 4000. On doing this I still don't have a working set of image/kernel files.

Can you post the exact commands you are running?

the-maldridge avatar Feb 02 '22 17:02 the-maldridge

linuxkit run qemu -mem 2048 docker 

Nothing fancier

deitch avatar Feb 02 '22 17:02 deitch

Hmm. Running your exact command makes me think that flag parsing order matters here in a way it shouldn't. At any rate it now fails with the expected runc error.
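
One possible explanation, stated as an assumption rather than something confirmed in this thread: Go's standard flag package stops parsing at the first non-flag argument. If linuxkit run qemu uses it and -kernel is a boolean switch, then in "-kernel docker -mem 4000" the word docker ends flag parsing and -mem 4000 is never applied. The function below is a hypothetical model of that behaviour, not linuxkit code. It takes the default of 1024 from the comment above.

```shell
# Hypothetical model of stop-at-first-positional flag parsing.
# Prints the memory value that would take effect (default 1024).
effective_mem() {
    mem=1024
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -mem)
                [ "$#" -ge 2 ] || break   # no value supplied, stop parsing
                mem=$2
                shift 2
                ;;
            -kernel)
                shift                     # assumed to be a boolean switch
                ;;
            *)
                break                     # first positional ends flag parsing
                ;;
        esac
    done
    printf '%s\n' "$mem"
}
```

Under this model, effective_mem -kernel docker -mem 4000 prints 1024, while effective_mem -mem 2048 docker prints 2048, which would match the two results reported here.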

the-maldridge avatar Feb 02 '22 17:02 the-maldridge

I am surprised that nothing gets logged here and this is a silent failure. Is it possible to detect this failure mode during boot?
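
Short of detecting it during boot, a check run from the getty shell afterwards could at least make the failure visible. A minimal sketch, assuming the three-column table that ctr printed earlier in this issue (TASK, PID, STATUS). The helper name services_not_running is hypothetical.

```shell
# Hypothetical helper: read a `ctr task list` table on stdin and print each
# expected service (given as arguments) that is absent or not RUNNING.
# Returns non-zero if anything is reported.
services_not_running() {
    table=$(cat)
    status=0
    for svc in "$@"; do
        # Column 1 is the task name, column 3 its status
        state=$(printf '%s\n' "$table" | awk -v s="$svc" '$1 == s { print $3 }')
        if [ "$state" != "RUNNING" ]; then
            printf '%s: %s\n' "$svc" "${state:-absent}"
            status=1
        fi
    done
    return "$status"
}
```

For example: ctr -n services.linuxkit task list | services_not_running getty ntpd rngd dhcpcd docker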

the-maldridge avatar Feb 02 '22 18:02 the-maldridge