
Semi-working in containerd/Kubernetes, "Cannot open mnt namespace file: No such file or directory"

Open rainbowcardiod opened this issue 1 year ago • 1 comments

I am trying to run Darling inside a container under Kubernetes. After adding mount -t tmpfs tmpfs /root/ to the Dockerfile ENTRYPOINT script, Darling is able to mount what it needs, and it works. However, it only works on some servers; on others it fails. This is the final error printed by darling shell on the failing servers:

Setting up a new Darling prefix at /root/.darling
Cannot open mnt namespace file: No such file or directory

More details: according to strace -v -s 256 -f darling shell, darling is able to mount its overlay:

[pid 87428] mount("overlay", "/root/.darling", "overlay", 0, "lowerdir=/usr/local/libexec/darling,upperdir=/root/.darling,workdir=/root/.darling.workdir,index=off") = 0

However, on some servers, darlingserver crashes:

...
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f19f6dd4a10) = 87428
...
[pid 87428] execve("/usr/local/bin/darlingserver", ["darlingserver", "/root/.darling", "0", "0", "4", "0"], ...
...
[pid 87428] openat(AT_FDCWD, "/sys/devices/system/cpu/online", O_RDONLY|O_CLOEXEC <unfinished ...>
...
[pid 87428] brk(0x5635ab0bb000)         = 0x5635ab0bb000
[pid 87428] brk(0x5635ab0dc000)         = 0x5635ab0dc000
[pid 87428] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x5635a9c7e000} ---
[pid 87444] <... futex resumed>)        = ?
[pid 87444] +++ killed by SIGSEGV (core dumped) +++
[pid 87428] +++ killed by SIGSEGV (core dumped) +++
...
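
A possible reading of the trace above, stated as an assumption rather than a confirmed diagnosis: if darlingserver dies right after startup, the mount-namespace file that the darling launcher later tries to open (presumably /proc/PID/ns/mnt) no longer exists, which would produce the "Cannot open mnt namespace file" message. A minimal sketch for checking this by hand follows; check_mnt_ns and its optional PROC_ROOT argument are hypothetical names introduced here, not part of Darling.

```shell
#!/bin/sh
# check_mnt_ns PID [PROC_ROOT]
# Succeeds if the mount-namespace file for PID exists under PROC_ROOT
# (default: /proc). PROC_ROOT is only there so the check can be tried
# against a fake directory tree.
check_mnt_ns() {
    pid=$1
    proc_root=${2:-/proc}
    ns="$proc_root/$pid/ns/mnt"
    # -L catches the symlink itself, -e covers everything else.
    if [ -L "$ns" ] || [ -e "$ns" ]; then
        echo "present: $ns"
        return 0
    fi
    echo "missing: $ns"
    return 1
}
```

For example, check_mnt_ns "$(pidof darlingserver)" right after starting darling shell would report whether the server process (and its namespace file) is still around.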

From dmesg:

[Tue Dec 19 16:27:16 2023] overlayfs: upperdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior.
[Tue Dec 19 16:27:16 2023] overlayfs: workdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior.
[Tue Dec 19 16:27:16 2023] darlingserver[xxxx]: segfault at 55a96181d000 ip 000055a9616c3b64 sp 00007ffd79f0a8d0 error 6 in darlingserver[55a9615c4000+182000]
[Tue Dec 19 16:27:16 2023] Code: 29 ff ff 48 89 45 d8 48 89 55 e0 48 8b 45 d8 48 89 45 e8 48 8b 45 e8 48 89 45 f0 48 8b 55 f0 48 8b 4d f8 48 8d 05 ac e5 13 00 <48> 89 14 c8 48 8b 7d f0 31 f6 ba a0 04 00 00 e8 b8 09 f0 ff 48 8b
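
The overlayfs warnings say that the upperdir and workdir are already used by another mount. A rough way to see which overlay mounts reference a given directory is to scan the mount table. The sketch below assumes the usual /proc/self/mounts layout (device, mount point, filesystem type, options); overlay_mounts_using is a hypothetical helper name, and the second argument exists only so the function can be run against a saved copy of the mount table.

```shell
#!/bin/sh
# overlay_mounts_using DIR [MOUNTS_FILE]
# Prints the mount point of every overlay mount whose upperdir or workdir
# option is exactly DIR. MOUNTS_FILE defaults to /proc/self/mounts.
overlay_mounts_using() {
    dir=$1
    mounts=${2:-/proc/self/mounts}
    awk -v d="$dir" '
        $3 == "overlay" {
            # Options are a comma-separated list in the fourth field.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == ("upperdir=" d) || opts[i] == ("workdir=" d)) {
                    print $2
                    break
                }
            }
        }
    ' "$mounts"
}
```

Running overlay_mounts_using /root/.darling inside the container, before and after a failed darling shell, would show whether a stale overlay from an earlier attempt is still mounted.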

I tried to use DPREFIX, but to no avail:

    mount -t tmpfs tmpfs /root
    mkdir /root/darling
    export DPREFIX=/root/darling

The only improvement is that dmesg no longer warns with overlayfs: workdir is in-use as upperdir/workdir of another mount, accessing files from both mounts will result in undefined behavior. However, the strace output remains the same.

Note well that on some servers everything works as expected.

Given all of the above, I believe this is a bug in darlingserver.

containerd versions used: 1.6.8, 1.6.9. Host system: Ubuntu 22.04.01 servers, with these kernel versions:

linux 5.15.0-71-generic: does not work
linux 5.4.0-155-generic: does not work
linux 5.4.0-89-generic: works
linux 5.15.0-56-generic: works

(So the kernel version does not seem to be the problem.)

This is an excerpt from the Dockerfile used to create the image. Note that the multistage build is there to make the image smaller. Attachment: excerpt.dockerfile.txt

rainbowcardiod avatar Dec 19 '23 16:12 rainbowcardiod

@rainbowcardiod: This worked for me:

    docker run -it --privileged [image_name] bash

followed by:

    mkdir -p /root/overlay &&
    mount -t tmpfs tmpfs /root/overlay &&
    export DPREFIX=/root/overlay/.darling

PS: If you don't run privileged, it won't be able to mount. You can also build a Dockerfile with the latest version of Ubuntu and the latest version of Darling, but you will have to compile it.
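
For an ENTRYPOINT script, the same steps could be wrapped in a small function that skips the tmpfs mount when the directory is already a mount point. This is only a sketch under assumptions: setup_darling_prefix and the MOUNT_CMD override are hypothetical names introduced here (the override just makes a dry run possible without privileges), and the mountpoint utility from util-linux is assumed to be available in the image.

```shell
#!/bin/sh
# setup_darling_prefix [BASE]
# Creates BASE (default: /root/overlay), mounts a tmpfs on it unless it is
# already a mount point, and exports DPREFIX=BASE/.darling.
# MOUNT_CMD can be set to replace the real mount command for a dry run.
setup_darling_prefix() {
    base=${1:-/root/overlay}
    mkdir -p "$base" || return 1
    # Avoid stacking a second tmpfs on top of an existing one.
    if ! mountpoint -q "$base" 2>/dev/null; then
        ${MOUNT_CMD:-mount} -t tmpfs tmpfs "$base" || return 1
    fi
    DPREFIX="$base/.darling"
    export DPREFIX
}
```

In the container this would be called as setup_darling_prefix && darling shell; as noted above, the real mount still needs a privileged container.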

humble-desser avatar Dec 21 '23 17:12 humble-desser