Caps are still dropped when setting kernelArg 'kexec_load_disabled=1'
Bug Report
A possible regression: v0.14 introduced the ability to restore full capabilities when kexec is disabled. As of v1.4, this feature no longer works, and the change is not mentioned in the release notes.
This may have been an unintended side effect of cc6e37a, which no longer checks for the kernel arguments. Should the former behavior be restored to permit DinD, or is there an alternative suggested mechanism?
Environment
Talos version: [talosctl version --nodes <problematic nodes>]
Client:
Tag: v1.9.4
SHA: c863a5617614722ba5d5ad6477593be9d40dd6cb
Built:
Go version: go1.24.0
OS/Arch: linux/amd64
Server:
NODE: 10.100.0.14
Tag: v1.9.4
SHA: c863a561
Built:
Go version: go1.23.6
OS/Arch: linux/amd64
Enabled: RBAC
Kubernetes version: [kubectl version]
Client Version: v1.32.2
Kustomize Version: v5.5.0
Server Version: v1.32.0
- Platform: Proxmox QEMU
Oh, I glossed over it on my initial read, but it looks like it's expecting proc.sysctl.kernel.kexec_load_disabled. I'll update my machineconfig and double-check that it works, and if it does, I'll open a PR to add a note about this workaround to the Process Capabilities docs!
dind works with capabilities dropped; you must be using a very old dind image. We use it all the time.
The changed kernelArg didn't work, and with more digging, it looks like the check actually disappeared in 1.9 with 66012a7.
From talosctl -n 10.100.0.16 dmesg:
10.100.0.16: kern: notice: [2025-03-17T06:39:54.053328404Z]: Kernel command line: BOOT_IMAGE=/B/vmlinuz talos.platform=nocloud talos.config=none console=tty1 console=ttyS0 net.ifnames=0 net.ifnames=0 init_on_alloc=1 slab_nomerge pti=on consoleblank=0 nvme_core.io_timeout=4294967295 printk.devkmsg=on ima_template=ima-ng ima_appraise=fix ima_hash=sha512 proc.sys.kernel.kexec_load_disabled=1
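To confirm that the argument really reached the kernel, the command line can be parsed rather than read by eye. The following is a minimal sketch; `parse_cmdline` is a hypothetical helper written for illustration and is not part of Talos. It only shows what the kernel was booted with. It says nothing about whether Talos consults the argument.

```python
import shlex


def parse_cmdline(cmdline):
    """Parse a kernel command line into {key: [values...]}.

    Bare flags (no '=') are recorded with a value of None. Repeated keys
    (e.g. console=, net.ifnames=) keep every value in order.
    """
    args = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        args.setdefault(key, []).append(value if sep else None)
    return args


# Shortened copy of the command line from the dmesg output above.
sample = (
    "BOOT_IMAGE=/B/vmlinuz talos.platform=nocloud console=tty1 console=ttyS0 "
    "slab_nomerge proc.sys.kernel.kexec_load_disabled=1"
)
parsed = parse_cmdline(sample)
print(parsed.get("proc.sys.kernel.kexec_load_disabled"))
```

On a live node the same function can be fed the contents of /proc/cmdline.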
And from a privileged pod:
root@talos-prod-worker03:~# getpcaps $$
46568: =ep cap_sys_module,cap_sys_boot-ep
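The same check can be done without getpcaps by decoding the capability masks in /proc/<pid>/status. This is a rough sketch: the bit numbers 16 (CAP_SYS_MODULE) and 22 (CAP_SYS_BOOT) come from linux/capability.h, while the helper names and the sample mask are made up for illustration.

```python
CAP_SYS_MODULE = 16  # bit numbers from linux/capability.h
CAP_SYS_BOOT = 22


def parse_cap_mask(status_text, field="CapBnd"):
    """Return the integer mask for a Cap* field of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError(f"{field} not found in status text")


def has_cap(mask, cap):
    """True if capability bit `cap` is set in `mask`."""
    return bool((mask >> cap) & 1)


# Illustrative status excerpt: all 41 capability bits set except 16 and 22.
sample_status = "Name:\tbash\nCapBnd:\t000001ffffbeffff\n"
mask = parse_cap_mask(sample_status)
print("CAP_SYS_MODULE present:", has_cap(mask, CAP_SYS_MODULE))
print("CAP_SYS_BOOT present:", has_cap(mask, CAP_SYS_BOOT))
```

Inside the pod, passing the contents of /proc/self/status instead of the sample string reports the bounding set of the current process.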
dind works with capabilities dropped; you must be using a very old dind image. We use it all the time.
Sorry, I should be more specific about the use-case: The Concourse worker spawns containerd within its pod. It then uses the nested containerd for its own workloads. It is failing with:
failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: unable to apply caps: operation not permitted: unknown
Which, along with the Process Capabilities docs, points to missing caps.
Looks like you're using a very old container runtime in Concourse. This issue (requesting a proper list of capabilities) was fixed, e.g. in Docker, a couple of years ago.
It's a bug to create a container that has more capabilities than the calling process has.
The versions of the runtimes in the container are only behind by a few weeks:
root@concourse-worker-0:/usr/local/concourse/bin# ./containerd --version
containerd github.com/containerd/containerd/v2 v2.0.2 c507a0257ea6462fbd6f5ba4f5c74facb04021f4
root@concourse-worker-0:/usr/local/concourse/bin# ./runc --version
runc version 1.2.4
commit: v1.2.4-0-g6c52b3fc
spec: 1.2.0
go: go1.22.10
libseccomp: 2.5.5
But my main question with regard to this bug report is whether the change in behavior within Talos was intentional, and whether the original behavior should be restored.
Then it's a bug in Concourse itself (I'm not familiar with the details). Either way, it should never try to run a container with capabilities that are not available to the process creating the container.
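The rule described here can be sketched as an intersection: whatever spawns the nested container requests only the capabilities it holds itself. This is an illustration of the idea only, not how runc, containerd, or Concourse implement it; `clamp_capabilities` and the sample lists are hypothetical.

```python
def clamp_capabilities(requested, available):
    """Split `requested` into (granted, dropped) against `available`.

    `requested` is the capability list a container spec asks for;
    `available` is the set the creating process actually holds.
    Order of `requested` is preserved in both results.
    """
    available = set(available)
    granted = [cap for cap in requested if cap in available]
    dropped = [cap for cap in requested if cap not in available]
    return granted, dropped


# A caller that lacks CAP_SYS_MODULE and CAP_SYS_BOOT, as in the
# getpcaps output earlier in this thread.
caller_caps = {"CAP_CHOWN", "CAP_NET_ADMIN", "CAP_SYS_ADMIN"}
wanted = ["CAP_CHOWN", "CAP_SYS_MODULE", "CAP_SYS_ADMIN", "CAP_SYS_BOOT"]

granted, dropped = clamp_capabilities(wanted, caller_caps)
print("granted:", granted)
print("dropped:", dropped)
```

Requesting only the granted list avoids the "unable to apply caps: operation not permitted" failure, at the cost of the nested workload not receiving the dropped capabilities.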
I appreciate your guidance regarding my use-case, and I'll take back my learnings from here to Concourse to see if I can provide a patch there. However, I want to make sure the focus on this issue is contained to the behavior of Talos.
A better summary of what I've learned throughout this issue:
Prior to #9489, adding the kernelArg proc.sys.kernel.kexec_load_disabled=1 would cause Talos to not drop CAP_SYS_MODULE and CAP_SYS_BOOT when spawning its containerd/cri. The addition of the feature was documented in the release notes, but its removal was not mentioned anywhere.
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.
Bump. Still struggling to figure out if there's a way to get Concourse to run on Talos, and whether or not kexec_load_disabled=1 works correctly and/or provides a solution. 😅
I ran into this issue with Concourse as well. From what I am seeing, there is still no workaround, correct?