
runc seccomp test fails on s390x (`numerical argument out of domain`)

Open ricardobranco777 opened this issue 4 months ago • 1 comments

Description

runc seccomp test fails on s390x.

I currently don't have access to an s390x system with Tumbleweed to run this test.

Tracking in SUSE as https://bugzilla.suse.com/show_bug.cgi?id=1247567

Steps to reproduce the issue

### RUN AS root
curl -sL --retry 9 --retry-delay 100 --retry-max-time 900 https://github.com/bats-core/bats-core/archive/refs/tags/v1.11.1.tar.gz | tar -zxf -
bash bats-core-1.11.1/install.sh /usr/local
rm -rf bats-core-1.11.1
mkdir -pm 0750 /etc/sudoers.d/
echo 'Defaults secure_path="/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin"' > /etc/sudoers.d/usrlocal
zypper --gpg-auto-import-keys -n install glibc-devel-static git-core go1.24 jq libseccomp-devel make runc
mkdir -p /etc/systemd/system/user@.service.d/
echo -e "[Service]\nDelegate=cpu cpuset io memory pids" > /etc/systemd/system/user@.service.d/60-delegate.conf
reboot
### RUN AS user
cd /var/tmp/
git clone https://github.com/opencontainers/runc.git
cd /var/tmp/runc
git checkout v1.2.6
curl -sL --retry 9 --retry-delay 100 --retry-max-time 900 -O make memfd-bind fs-idmap pidfd-kill recvtty remap-rootfs sd-helper seccompagent || true
### RUN AS root
cd /var/tmp/runc
mkdir -p /var/tmp/test.bhMPpb
env BATS_TMPDIR=/var/tmp/test.bhMPpb PATH=/usr/local/bin:$PATH:/usr/sbin:/sbin RUNC=/usr/bin/runc RUNC_USE_SYSTEMD=1 bats --tap -T tests/integration/seccomp.bats | tee -a runc-root.tap
sudo rm -rf /var/tmp/test.bhMPpb || true

Describe the results you received and expected

not ok 209 runc run [seccomp -ENOSYS handling] in 699ms
# (in test file tests/integration/seccomp.bats, line 22)
#   `[ "$status" -eq 0 ]' failed
# runc spec (status=0):
#
# runc run test_busybox (status=1):
# time="2025-08-04T21:13:25+02:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/system.slice/runc-test_busybox.scope/memory.events: no such file or directory"
# time="2025-08-04T21:13:25+02:00" level=error msg="runc run failed: unable to start container process: error during container init: unable to init seccomp: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---
not ok 210 runc run [seccomp defaultErrnoRet=ENXIO] in 561ms
# (in test file tests/integration/seccomp.bats, line 34)
#   `[ "$status" -eq 0 ]' failed
# runc spec (status=0):
#
# runc run test_busybox (status=1):
# time="2025-08-04T21:13:26+02:00" level=error msg="runc run failed: unable to start container process: error during container init: unable to init seccomp: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---
not ok 211 runc run [seccomp] (SCMP_ACT_ERRNO default) in 549ms
# (in test file tests/integration/seccomp.bats, line 52)
#   `[[ "$output" == *"mkdir:"*"/dev/shm/foo"*"Operation not permitted"* ]]' failed
# runc spec (status=0):
#
# runc run test_busybox (status=1):
# time="2025-08-04T21:13:27+02:00" level=error msg="runc run failed: unable to start container process: error during container init: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---
not ok 212 runc run [seccomp] (SCMP_ACT_ERRNO explicit errno) in 564ms
# (in test file tests/integration/seccomp.bats, line 66)
#   `[[ "$output" == *"Network is down"* ]]' failed
# runc spec (status=0):
#
# runc run test_busybox (status=1):
# time="2025-08-04T21:13:27+02:00" level=error msg="runc run failed: unable to start container process: error during container init: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---
not ok 213 runc run [seccomp] (SECCOMP_FILTER_FLAG_*) in 705ms
# (in test file tests/integration/seccomp.bats, line 146)
#   `[[ "$output" == *"mkdir:"*"/dev/shm/foo"*"Operation not permitted"* ]]' failed
# runc spec (status=0):
#
# runc --debug run test_busybox (status=1):
# time="2025-08-04T21:13:28+02:00" level=debug msg="F_GET_SEALS on /proc/self/exe failed: invalid argument" func="libcontainer/dmz.IsCloned()" file="libcontainer/dmz/cloned_binary_linux.go:206"
# time="2025-08-04T21:13:28+02:00" level=debug msg="runc-dmz: using overlayfs for sealed /proc/self/exe" func="libcontainer/dmz.CloneSelfExe()" file="libcontainer/dmz/cloned_binary_linux.go:231"
# time="2025-08-04T21:13:28+02:00" level=debug msg="runc-dmz: using /proc/self/exe clone" func="libcontainer.(*Container).newParentProcess()" file="libcontainer/container_linux.go:507"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec[48660]: => nsexec container setup"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: ~> nsexec stage-0"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: spawn stage-1"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: -> stage-1 synchronisation loop"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: ~> nsexec stage-1"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: unshare remaining namespaces"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: spawn stage-2"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-2[1]: ~> nsexec stage-2"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: request stage-0 to forward stage-2 pid (48664)"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: stage-1 requested pid to be forwarded"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: forward stage-1 (48663) and stage-2 (48664) pids to runc"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: signal completion to stage-0"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-1[48663]: <~ nsexec stage-1"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: stage-1 complete"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: <- stage-1 synchronisation loop"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: -> stage-2 synchronisation loop"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: signalling stage-2 to run"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-2[1]: signal completion to stage-0"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-2[1]: <= nsexec container setup"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-2[1]: booting up go runtime ..."
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: stage-2 complete"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: <- stage-2 synchronisation loop"
# time="2025-08-04T21:13:28+02:00" level=debug msg="nsexec-0[48660]: <~ nsexec stage-0"
# time="2025-08-04T21:13:28+02:00" level=debug msg="mount source thread: successfully running in container mntns"
# time="2025-08-04T21:13:28+02:00" level=debug msg="reading sync" func="libcontainer.doReadSync()" file="libcontainer/sync.go:127"
# time="2025-08-04T21:13:28+02:00" level=debug msg="child process in init()"
# time="2025-08-04T21:13:28+02:00" level=debug msg="writing sync type:procHooks"
# time="2025-08-04T21:13:28+02:00" level=debug msg="reading sync"
# time="2025-08-04T21:13:28+02:00" level=debug msg="read sync type:procHooks" func="libcontainer.doReadSync()" file="libcontainer/sync.go:139"
# time="2025-08-04T21:13:28+02:00" level=debug msg="writing sync type:procHooksDone"
# time="2025-08-04T21:13:28+02:00" level=debug msg="reading sync"
# time="2025-08-04T21:13:28+02:00" level=debug msg="read sync type:procHooksDone" func="libcontainer/logs.processEntry()" file="libcontainer/logs/logs.go:55"
# time="2025-08-04T21:13:28+02:00" level=debug msg="writing sync type:procReady" func="libcontainer/logs.processEntry()" file="libcontainer/logs/logs.go:55"
# time="2025-08-04T21:13:28+02:00" level=debug msg="read sync type:procReady"
# time="2025-08-04T21:13:28+02:00" level=debug msg="reading sync" func="libcontainer/logs.processEntry()" file="libcontainer/logs/logs.go:55"
# time="2025-08-04T21:13:28+02:00" level=debug msg="writing sync type:procRun"
# time="2025-08-04T21:13:28+02:00" level=debug msg="reading sync"
# time="2025-08-04T21:13:28+02:00" level=debug msg="read sync type:procRun" func="libcontainer/logs.processEntry()" file="libcontainer/logs/logs.go:55"
# time="2025-08-04T21:13:28+02:00" level=debug msg="writing sync type:procError arg:{\"message\":\"error adding architecture to seccomp filter: numerical argument out of domain\"}" func="libcontainer/logs.processEntry()" file="libcontainer/logs/logs.go:55"
# time="2025-08-04T21:13:28+02:00" level=debug msg="read sync type:procError arg:{\"message\":\"error adding architecture to seccomp filter: numerical argument out of domain\"}"
# time="2025-08-04T21:13:28+02:00" level=debug msg="mount source thread: closing thread: context canceled" func="libcontainer.(*initProcess).goCreateMountSources.func1()" file="libcontainer/process_linux.go:485"
# time="2025-08-04T21:13:28+02:00" level=error msg="runc run failed: unable to start container process: error during container init: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---
ok 214 runc run [seccomp] (SCMP_ACT_KILL) in 472ms
not ok 215 runc run [seccomp] (startContainer hook) in 489ms
# (in test file tests/integration/seccomp.bats, line 185)
#   `[[ "$output" == *"error running startContainer hook"* ]]' failed
# runc spec (status=0):
#
# runc run test_busybox (status=1):
# time="2025-08-04T21:13:29+02:00" level=error msg="runc run failed: unable to start container process: error during container init: unable to init seccomp: error adding architecture to seccomp filter: numerical argument out of domain"
# --- teardown ---

What version of runc are you using?

runc version 1.2.6 commit: v1.2.6-0-ge89a29929c77 spec: 1.2.0 go: go1.24.5 libseccomp: 2.6.0

Host OS information

SUSE Linux Enterprise Server 16.0 PublicRC

Host kernel information

Linux sle16 6.12.0-160000.20-default #1 SMP PREEMPT_DYNAMIC Mon Jul 21 10:20:07 UTC 2025 (b00eabe) x86_64 x86_64 x86_64 GNU/Linux

ricardobranco777, Aug 04 '25 19:08

Hm.

So it seems that libseccomp 2.4.0 changed the behaviour of seccomp_arch_add() so that it returns -EDOM when the endianness of the architecture being added does not match that of the architectures already in the filter. Previously it returned -EEXIST, which libseccomp-golang silently masks because it assumes -EEXIST means the filter already contains the architecture.
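
To make the failing condition concrete, here is a minimal shell sketch that checks whether a list of seccomp architecture names mixes byte orders, which is the combination described above as producing -EDOM. The helper names `arch_endianness` and `has_mixed_endianness` are hypothetical and not part of runc or libseccomp, and the architecture table is illustrative rather than exhaustive.

```shell
# Hypothetical helper: byte order of a seccomp architecture name as it
# appears in config.json. Illustrative subset only, not exhaustive.
arch_endianness() {
    case "$1" in
    SCMP_ARCH_S390 | SCMP_ARCH_S390X | SCMP_ARCH_PPC | SCMP_ARCH_PPC64 | SCMP_ARCH_MIPS | SCMP_ARCH_MIPS64)
        echo big ;;
    SCMP_ARCH_X86 | SCMP_ARCH_X86_64 | SCMP_ARCH_X32 | SCMP_ARCH_ARM | SCMP_ARCH_AARCH64 | SCMP_ARCH_PPC64LE | SCMP_ARCH_RISCV64)
        echo little ;;
    *)
        echo unknown ;;
    esac
}

# Exit status 0 if the given list contains both big- and little-endian
# architectures, non-zero otherwise. Unknown names are ignored.
has_mixed_endianness() {
    local arch seen_big=0 seen_little=0
    for arch in "$@"; do
        case "$(arch_endianness "$arch")" in
        big) seen_big=1 ;;
        little) seen_little=1 ;;
        esac
    done
    [ "$seen_big" -eq 1 ] && [ "$seen_little" -eq 1 ]
}
```

For example, `has_mixed_endianness SCMP_ARCH_S390X SCMP_ARCH_X86_64` succeeds, while a list containing only SCMP_ARCH_X86_64 and SCMP_ARCH_AARCH64 does not.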

Nobody has mentioned this in the past 5 years of it being broken, so I guess that defining every architecture in the architecture set is not a common thing -- but I was personally not aware that this was a limitation of libseccomp. I assumed that they would generate a separate filter for each architecture, with an if statement to switch between them.

We do not currently have any CI for big-endian architectures, which is why we missed this test failing. I guess the tests should be adjusted so that we enable multiple architectures but only ones that match the endianness of the native architecture.
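
One possible shape for that adjustment, as a rough sketch only: detect the host byte order and keep just the architectures that match it. The function names `host_endianness` and `native_endian_archs` are hypothetical and not existing test helpers. The big-endian list is an illustrative subset, and treating every other name as little-endian is a simplifying assumption.

```shell
# Byte order of the machine running the tests. od(1) reads the two bytes
# 0x01 0x00 in host order: 1 on little-endian, 256 on big-endian.
host_endianness() {
    if [ "$(printf '\001\000' | od -An -tu2 | tr -d ' \n')" = "1" ]; then
        echo little
    else
        echo big
    fi
}

# Hypothetical helper: print, one per line, only the architectures whose
# byte order matches the host. Assumption: any name not in the big-endian
# list below is treated as little-endian.
native_endian_archs() {
    local host arch order
    host="$(host_endianness)"
    for arch in "$@"; do
        case "$arch" in
        SCMP_ARCH_S390 | SCMP_ARCH_S390X | SCMP_ARCH_PPC | SCMP_ARCH_PPC64 | SCMP_ARCH_MIPS | SCMP_ARCH_MIPS64)
            order=big ;;
        *)
            order=little ;;
        esac
        if [ "$order" = "$host" ]; then
            echo "$arch"
        fi
    done
}
```

On an s390x host, `native_endian_archs SCMP_ARCH_X86_64 SCMP_ARCH_S390X` would print only SCMP_ARCH_S390X, and the result could then be written into the test's seccomp architectures list.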

cyphar, Aug 05 '25 04:08