Support for 4.x kernels has been dropped?
I can't see anything obvious in the changelogs, but it looks like support for Linux 4.x kernels was dropped at some point after 1.18.9. We currently run some clusters with a combination of 4.19.0-19 and 5.10.0-29 kernels, and the clusters with 4.x kernels are now failing to deploy the node agent, with the following log output:
I0705 09:18:07.156568 85825 net.go:20] whitelisted public IPs: [0.0.0.0/0]
I0705 09:18:07.156905 85825 net.go:32] ephemeral-port-range: 32768-60999
I0705 09:18:07.164387 85825 cilium.go:30] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct4_global: no such file or directory
I0705 09:18:07.164448 85825 cilium.go:36] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct6_global: no such file or directory
I0705 09:18:07.164460 85825 cilium.go:43] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v2: no such file or directory
I0705 09:18:07.164472 85825 cilium.go:43] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v3: no such file or directory
I0705 09:18:07.164483 85825 cilium.go:52] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v2: no such file or directory
I0705 09:18:07.164491 85825 cilium.go:52] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v3: no such file or directory
I0705 09:18:07.167570 85825 main.go:111] agent version: 1.20.3
I0705 09:18:07.167635 85825 main.go:117] hostname: xxxxxxxxx-worker-1
I0705 09:18:07.167644 85825 main.go:118] kernel version: 4.19.0-18-amd64
I0705 09:18:07.169872 85825 main.go:75] machine-id: xxxxxxxxxxxxxxxxx
I0705 09:18:07.169971 85825 tracing.go:37] OpenTelemetry traces collector endpoint: http://coroot:8080/v1/traces
I0705 09:18:07.170090 85825 otel.go:29] OpenTelemetry logs collector endpoint: http://coroot:8080/v1/logs
I0705 09:18:07.170401 85825 metadata.go:67] cloud provider:
I0705 09:18:07.170419 85825 collector.go:157] instance metadata: <nil>
I0705 09:18:07.170670 85825 profiling.go:52] profiles endpoint: http://coroot:8080/v1/profiles
E0705 09:18:07.198354 85825 profiling.go:100] load bpf objects: field DisassociateCtty: program disassociate_ctty: apply CO-RE relocations: load kernel spec: no BTF found for kernel version 4.19.0-18-amd64: not supported
I0705 09:18:10.202542 85825 containerd.go:38] using /run/containerd/containerd.sock
W0705 09:18:10.202604 85825 registry.go:85] stat /proc/1/root/var/run/crio/crio.sock: no such file or directory
E0705 09:18:10.234982 85825 tracer.go:191] load program: argument list too long:
F0705 09:18:10.235037 85825 main.go:149] failed to load collection: program sys_enter_sendmmsg: load program: argument list too long
It wasn't intentional. We added an eBPF program with more instructions than the others, and kernel 4.19 has a lower limit on the number of instructions in an eBPF program.
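For context on the log message itself: "argument list too long" is the standard text for the `E2BIG` errno, which the kernel returns from the `bpf()` syscall when a program has more instructions than it allows. A minimal sketch in plain C of how a loader could turn that errno into a clearer hint follows; the helper name `bpf_load_hint` is hypothetical and not part of the agent.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not from the agent): map the errno of a failed
 * BPF program load to a more descriptive hint. E2BIG is what the kernel
 * reports when the program exceeds its instruction limit. */
static const char *bpf_load_hint(int err) {
    if (err == E2BIG) {
        return "program exceeds the kernel's BPF instruction limit";
    }
    /* Fall back to the standard errno description for everything else. */
    return strerror(err);
}
```

With this, a failed load that currently prints only "argument list too long" could also print the hint returned for `E2BIG`.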
Are there plans to try and support 4.x kernels again or should the minimum requirements listed in the readme be updated?
It uses eBPF to track container related events such as TCP connects, so the minimum supported Linux kernel version is 4.16.
It seems to be caused by this code, where the unrolled loop expands into two very long instruction sequences.
SEC("tracepoint/syscalls/sys_enter_sendmmsg")
int sys_enter_sendmmsg(struct trace_event_raw_sys_enter_rw__stub* ctx) {
    __u64 offset = 0;
    #pragma unroll
    for (int i = 0; i <= 1; i++) {
        if (i >= ctx->size) {
            break;
        }
        struct mmsghdr h = {};
        if (bpf_probe_read(&h, sizeof(h), (void *)(ctx->buf + offset))) {
            return 0;
        }
        offset += sizeof(h);
        trace_enter_write(ctx, ctx->fd, 0, (char*)h.msg_hdr.msg_iov, 0, h.msg_hdr.msg_iovlen);
    }
    return 0;
}
The Coroot site (https://docs.coroot.com/installation/requirements) still lists "minimum supported Linux kernel version is 4.16" as an installation requirement. Does this problem only apply to the 4.19 kernel?
The issue is still present in the latest version.
I0411 04:43:43.554720 78820 net.go:24] whitelisted public IPs: [0.0.0.0/0]
I0411 04:43:43.554812 78820 net.go:36] ephemeral-port-range: 32768-60999
I0411 04:43:43.560298 78820 cilium.go:30] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct4_global: no such file or directory
I0411 04:43:43.560338 78820 cilium.go:36] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct6_global: no such file or directory
I0411 04:43:43.560354 78820 cilium.go:43] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v2: no such file or directory
I0411 04:43:43.560367 78820 cilium.go:43] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v3: no such file or directory
I0411 04:43:43.560382 78820 cilium.go:52] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v2: no such file or directory
I0411 04:43:43.560392 78820 cilium.go:52] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v3: no such file or directory
I0411 04:43:43.563772 78820 main.go:108] agent version: 1.23.12
I0411 04:43:43.563852 78820 main.go:114] hostname: localhost.localdomain
I0411 04:43:43.563861 78820 main.go:115] kernel version: 4.19.90-2107.6.0.0192.8.oe1.bclinux.aarch64
I0411 04:43:43.566753 78820 main.go:72] machine-id: 7fb0d43edf154cd3a974330c98b8cacb
I0411 04:43:43.566822 78820 tracing.go:40] OpenTelemetry traces collector endpoint:
I0411 04:43:43.566926 78820 otel.go:30] OpenTelemetry logs collector endpoint:
I0411 04:43:43.567135 78820 metadata.go:74] cloud provider:
I0411 04:43:43.567149 78820 collector.go:157] instance metadata: <nil>
I0411 04:43:43.567325 78820 profiling.go:55] profiles endpoint:
E0411 04:43:43.589643 78820 profiling.go:103] load bpf objects: field DisassociateCtty: program disassociate_ctty: map .rodata: map create: read- and write-only maps not supported (requires >= v5.2)
I0411 04:43:43.589882 78820 cgroup_linux.go:51] cgroup v2 root is /host/sys/fs/cgroup
W0411 04:43:47.594161 78820 registry.go:93] couldn't connect to containerd through the following UNIX sockets [/var/snap/microk8s/common/run/containerd.sock,/run/k0s/containerd.sock,/run/k3s/containerd/containerd.sock,/run/containerd/containerd.sock]: failed to dial "/proc/1/root/run/containerd/containerd.sock": context deadline exceeded
I0411 04:43:47.594208 78820 crio.go:58] cri-o socket:
I0411 04:43:47.595520 78820 tracer.go:96] L7 tracing is disabled
E0411 04:43:48.200530 78820 tracer.go:213] load program: argument list too long:
F0411 04:43:48.200595 78820 main.go:146] failed to load collection: program sys_enter_sendmmsg: load program: argument list too long
[sw@localhost docker_compose]$ uname -a
Linux localhost.localdomain 4.19.90-2107.6.0.0192.8.oe1.bclinux.aarch64 #1 SMP Tue Mar 21 09:23:05 CST 2023 aarch64 aarch64 aarch64 GNU/Linux
Given that Linux kernel 4.19 is now in super-long-term support and no longer receives new features, I don't think support for 4.x kernels will be coming back. It'd be great if @def could confirm this, though, and update the minimum requirements.
We have long since moved on to 5.x and 6.x kernels without issues.
@FutureMatt, we'd love to fix that! But on the first attempt, we couldn’t find an easy way to reduce the number of instructions. So you're right, we should update the requirements in the docs. Would you be open to contributing to this? :) https://github.com/coroot/coroot/blob/main/docs/docs/installation/requirements.md
Sure, I'm happy to, but can you confirm what the minimum kernel version is now? Is it just 5.x, or are the requirements deeper than that?
According to Cilium's docs:
The maximum instruction limit per program is restricted to 4096 BPF instructions, which, by design, means that any program will terminate quickly. For kernel newer than 5.1 this limit was lifted to 1 million BPF instructions.
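To make the quoted rule concrete, here is a minimal sketch in plain C that maps a `uname -r` style release string to the instruction limit described above. The helper name `bpf_insn_limit_for_release` and the two macros are hypothetical, and the sketch assumes the upstream version number is a reliable indicator; distribution kernels with backported patches may behave differently.

```c
#include <stdio.h>

/* Limits as quoted from Cilium's docs; macro names are illustrative. */
#define BPF_LIMIT_OLD 4096L     /* kernels up to and including 5.1 */
#define BPF_LIMIT_NEW 1000000L  /* kernels newer than 5.1 */

/* Hypothetical helper: return the expected BPF instruction limit for a
 * kernel release string such as "4.19.0-18-amd64", or -1 if the string
 * does not start with "<major>.<minor>". */
static long bpf_insn_limit_for_release(const char *release) {
    int major = 0, minor = 0;
    if (release == NULL || sscanf(release, "%d.%d", &major, &minor) != 2) {
        return -1;
    }
    if (major > 5 || (major == 5 && minor > 1)) {
        return BPF_LIMIT_NEW;
    }
    return BPF_LIMIT_OLD;
}
```

For the kernels mentioned in this thread, `4.19.0-18-amd64` and `4.19.90-2107.6.0.0192.8.oe1.bclinux.aarch64` fall under the 4096-instruction limit, while `5.10.0-29` falls under the raised one.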