sysbox
Support for 32bit applications in a 64bit container
Hi @ctalledo,
I am encountering another problem.
Here you can see a container that, with sysbox, produces "Bad system call (core dumped)" when running ./bytecode_builtins_list_generator.
That app is a 32-bit app and my container is 64-bit:
file ./bytecode_builtins_list_generator
./bytecode_builtins_list_generator: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV),
dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0,
BuildID[sha1]=e3f1ebe53993fc8339b9686b316830ea4b64452a, with debug_info, not stripped
Steps to reproduce:
- sudo DOCKER_BUILDKIT=1 docker build -t uazo/test32bit .
- sudo docker run --runtime=sysbox-runc -ti --rm uazo/test32bit
- ./bytecode_builtins_list_generator
Running it without sysbox works perfectly. Is there any way to enable 32-bit application support in a 64-bit container with sysbox?
Thank you
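@uazo, I was able to reproduce the issue but haven't figured out its root cause yet. I also noticed that the problem is not seen when relying on the oci-runc (with and without user-ns).
The problem seems to be related to syscall id=45 executed by this binary and (apparently) blocked by the kernel's seccomp module. Note that with arch=40000003 (i386) and compat=1 the number refers to the 32-bit syscall table, where 45 is brk(); it would be recvfrom() only in the x86_64 table. That matches the strace capture further down. Please verify that you are also seeing this in your journald: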
Jul 15 00:43:47 ubuntu-focal-vm audit[6192]: SECCOMP auid=4294967295 uid=165536 gid=165536 ses=4294967295 pid=6192 comm="bytecode_builti" exe="/bytecode_builtins_list_generator" sig=31 arch=40000003 syscall=45 compat=1 ip=0xf7f70e3b code=0x0
Jul 15 00:43:47 ubuntu-focal-vm kernel: audit: type=1326 audit(1626309827.690:18): auid=4294967295 uid=165536 gid=165536 ses=4294967295 pid=6192 comm="bytecode_builti" exe="/bytecode_builtins_list_generator" sig=31 arch=40000003 syscall=45 compat=1 ip=0xf7f70e3b code=0x0
Strace capture. Crash is triggered early on, right after execve() + brk() execution:
[pid 6320] rt_sigaction(SIGXFSZ, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, NULL, 8) = 0
[pid 6320] rt_sigaction(SIGVTALRM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, NULL, 8) = 0
[pid 6320] rt_sigaction(SIGUSR1, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, NULL, 8) = 0
[pid 6320] rt_sigaction(SIGUSR2, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, NULL, 8) = 0
[pid 6320] rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, {sa_handler=0x5598d0380b30, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, 8) = 0
[pid 6320] rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, 8) = 0
[pid 6320] rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7feb88eed210}, {sa_handler=0x5598d0380610, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7feb88eed210}, 8) = 0
[pid 6320] rt_sigaction(SIGCHLD, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7feb88eed210}, {sa_handler=0x5598d0363aa0, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7feb88eed210}, 8) = 0
[pid 6320] execve("./bytecode_builtins_list_generator", ["./bytecode_builtins_list_generat"...], 0x5598d0f593b0 /* 8 vars */) = 0
strace: [ Process PID=6320 runs in 32 bit mode. ]
[pid 6320] brk(NULL) = ?
[pid 6320] +++ killed by SIGSYS (core dumped) +++
<... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSYS && WCOREDUMP(s)}], WSTOPPED|WCONTINUED, NULL) = 58
rt_sigprocmask(SIG_BLOCK, [CHLD TSTP TTIN TTOU], [CHLD], 8) = 0
ioctl(255, TIOCSPGRP, [1]) = 0
rt_sigprocmask(SIG_SETMASK, [CHLD], NULL, 8) = 0
ioctl(255, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(255, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = 0
ioctl(255, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(255, TIOCGWINSZ, {ws_row=70, ws_col=239, ws_xpixel=0, ws_ypixel=0}) = 0
write(2, "Bad system call (core dumped)\n", 30) = 30
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
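For anyone reading the audit records above, here is a minimal sketch of how the arch= and syscall= fields can be decoded. The function name decode_seccomp_record and the two small lookup tables are made up for illustration; the tables only cover the values that show up in this thread and are not a complete syscall map. The architecture tokens are the AUDIT_ARCH_* values from linux/audit.h.

```python
import re

# Audit architecture tokens (linux/audit.h): ELF machine number OR'ed with
# the little-endian (0x40000000) and 64-bit (0x80000000) flags.
AUDIT_ARCH = {
    0x40000003: "i386",
    0xC000003E: "x86_64",
}

# Illustrative excerpt of the syscall tables, limited to numbers seen in
# this thread. The same number means different things per architecture.
SYSCALLS = {
    "i386": {45: "brk", 243: "set_thread_area"},
    "x86_64": {45: "recvfrom"},
}

# key=value pairs; values are either quoted strings or bare tokens.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def decode_seccomp_record(line):
    """Return the interesting fields of one SECCOMP audit line (hypothetical helper)."""
    fields = {key: value.strip('"') for key, value in FIELD_RE.findall(line)}
    arch = AUDIT_ARCH.get(int(fields["arch"], 16), "unknown")
    number = int(fields["syscall"])
    return {
        "comm": fields["comm"],
        "arch": arch,
        "syscall": number,
        "name": SYSCALLS.get(arch, {}).get(number, "unknown"),
        "signal": int(fields["sig"]),
    }


if __name__ == "__main__":
    record = (
        'audit[6192]: SECCOMP auid=4294967295 uid=165536 gid=165536 '
        'ses=4294967295 pid=6192 comm="bytecode_builti" '
        'exe="/bytecode_builtins_list_generator" sig=31 arch=40000003 '
        'syscall=45 compat=1 ip=0xf7f70e3b code=0x0'
    )
    print(decode_seccomp_record(record))
```

Fed the record above, the sketch reports arch i386 and syscall 45 as brk, which lines up with the brk(NULL) call that strace shows just before the SIGSYS (signal 31).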
Thanks @rodnymolina for investigating!
I confirm:
Jul 15 05:50:07 ay audit[547883]: SECCOMP auid=4294967295 uid=165536 gid=165536 ses=4294967295 pid=547883 comm="bytecode_builti" exe="/bytecode_builtins_list_generator" sig=31 arch=40000003 syscall=45 compat=1 ip=0xf7f51e3b code=0x0
Jul 15 05:50:07 ay kernel: audit: type=1326 audit(1626328207.689:1679): auid=4294967295 uid=165536 gid=165536 ses=4294967295 pid=547883 comm="bytecode_builti" exe="/bytecode_builtins_list_generator" sig=31 arch=40000003 syscall=45 compat=1 ip=0xf7f51e3b code=0x0
In your opinion, from what you've seen so far, can it be fixed? If so, is there any temporary workaround while waiting for the fix (even at the expense of security, since I am in the test phase for now) so that I can continue my work?
Regarding "Strace capture. Crash is triggered early on, right after execve() + brk() execution": it's beyond my capabilities, but if I can help you in any way, please tell me. I don't think it will be useful to you, but here you can find the sources
I have the same problem, but my strace output is slightly different, so I thought I'd share it here:
$ strace /tmp/32bin
execve("/tmp/32bin", ["/tmp/32bin"], 0x7fff270578b0 /* 64 vars */) = 0
strace: [ Process PID=45108 runs in 32 bit mode. ]
set_thread_area({entry_number=-1, base_addr=0x9a85810, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1} <unfinished ...>) = ?
+++ killed by SIGSYS +++
zsh: invalid system call strace /tmp/32bin
Jan 29 01:03:37 gke-master-sydney-pool-3-adc98a83-so64 audit[3565791]: SECCOMP auid=101000 uid=101000 gid=101000 ses=276 pid=3565791 comm="32bin" exe="/tmp/32bin" sig=31 arch=40000003 syscall=243 compat=1 ip=0x80aec82 code=0x0
Jan 29 01:03:37 gke-master-sydney-pool-3-adc98a83-so64 audit[3565791]: ANOM_ABEND auid=101000 uid=101000 gid=101000 ses=276 pid=3565791 comm="32bin" exe="/tmp/32bin" sig=31 res=1
@rodnymolina, do you know if there is a workaround and/or plans to fix this? We are blocked by this problem.
Hi @isarkis, apologies for the delayed response, but @rodnymolina has been out of office the last couple of weeks.
I took a brief look at this issue a couple of days ago but did not spot anything obvious. We will take a closer look next week. I suspect the problem is in the way Sysbox is applying the seccomp filters.
Thanks for giving Sysbox a shot in your infra.
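While the seccomp handling is being looked at, one thing that might be worth checking (purely as an assumption, not a confirmed root cause) is whether the seccomp section of the container's OCI config.json lists the 32-bit architecture. In the OCI runtime spec, linux.seccomp.architectures is a list of names such as SCMP_ARCH_X86_64 and SCMP_ARCH_X86. The helper name missing_compat_arch below is hypothetical, and where config.json lives for a given container is left to the reader. Any filter a runtime installs on its own would not show up in this file, so a clean result here does not rule out the filters as the cause.

```python
import json


def missing_compat_arch(config_text, wanted="SCMP_ARCH_X86"):
    """Return True if a seccomp profile exists but does not list `wanted`.

    config_text is the content of an OCI runtime config.json.
    Hypothetical helper for illustration only.
    """
    config = json.loads(config_text)
    seccomp = config.get("linux", {}).get("seccomp")
    if seccomp is None:
        # No seccomp profile in the config at all, so nothing is missing.
        return False
    return wanted not in seccomp.get("architectures", [])


if __name__ == "__main__":
    # Made-up example profile that only lists the 64-bit architecture.
    example = json.dumps({
        "linux": {
            "seccomp": {
                "defaultAction": "SCMP_ACT_ERRNO",
                "architectures": ["SCMP_ARCH_X86_64"],
            }
        }
    })
    print(missing_compat_arch(example))
```

If the function returns True for a real config, the profile in that file does not cover 32-bit x86 binaries such as the ones in this thread.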
@ctalledo, any luck fixing this issue?
Hi @isarkis, my apologies, but I've been swamped with other Sysbox-related work and have not had a chance to look into this yet.
Will do my best to get to it this week, thanks for your patience.