zfs trouble on ARM64: segmentation fault
This is not great. How would I even go about debugging this, given that Talos doesn't boot properly as a result?
Running on an Oracle Ampere instance.
user: warning: [2025-01-20T09:56:31.57410882Z]: [talos] [initramfs] enabling system extension zfs 2.2.7-v1.9.2
user: warning: [2025-01-20T09:56:32.18043182Z]: [talos] service[ext-zfs-service](Starting): Starting service
user: warning: [2025-01-20T09:56:32.18533482Z]: [talos] service[ext-zfs-service](Waiting): Waiting for service "containerd" to be "up", service "udevd" to be "up", service "cri" to be "up", file "/dev/zfs" to exist
kern: warning: [2025-01-20T09:56:32.64627282Z]: zfs: module license 'CDDL' taints kernel.
kern: warning: [2025-01-20T09:56:32.65022382Z]: zfs: module license taints kernel.
user: warning: [2025-01-20T09:56:33.19103082Z]: [talos] service[ext-zfs-service](Waiting): Waiting for service "containerd" to be "up", service "udevd" to be "up", service "cri" to be registered, file "/dev/zfs" to exist
kern: notice: [2025-01-20T09:56:33.27346082Z]: ZFS: Loaded module v2.2.7-1, ZFS pool version 5000, ZFS filesystem version 5
user: warning: [2025-01-20T09:56:34.19160382Z]: [talos] service[ext-zfs-service](Waiting): Waiting for service "cri" to be registered
user: warning: [2025-01-20T09:56:34.96757382Z]: [talos] task startAllServices (1/1): service "apid" to be "up", service "auditd" to be "up", service "containerd" to be "up", service "cri" to be "up", service "etcd" to be "up", service "ext-iscsid" to be "up", service "ext-tgtd" to be "up", service "ext-zfs-service" to be "up", service "kubelet" to be "up", service "machined" to be "up", service "syslogd" to be "up", service "trustd" to be "up", service "udevd" to be "up"
user: warning: [2025-01-20T09:56:35.19158382Z]: [talos] service[ext-zfs-service](Waiting): Waiting for service "cri" to be "up"
user: warning: [2025-01-20T09:56:35.97000382Z]: [talos] service[ext-zfs-service](Preparing): Running pre state
user: warning: [2025-01-20T09:56:35.97765882Z]: [talos] service[ext-zfs-service](Preparing): Creating service runner
user: warning: [2025-01-20T09:56:36.06776182Z]: [talos] service[ext-zfs-service](Running): Started task ext-zfs-service (PID 5315) for container ext-zfs-service
user: warning: [2025-01-20T09:56:36.52519982Z]: [talos] service[ext-zfs-service](Waiting): Error running Containerd(ext-zfs-service), going to restart until it succeeds: task "ext-zfs-service" failed: exit code 1
user: warning: [2025-01-20T09:56:41.59867282Z]: [talos] service[ext-zfs-service](Running): Started task ext-zfs-service (PID 5621) for container ext-zfs-service
talosctl logs ext-zfs-service:
0 / 0 keys successfully loaded
2025/01/20 09:56:36 zfs-service: zpool import error: signal: segmentation fault
no pools available to import
This suggests the zpool program is crashing. You can spawn a privileged system pod and try to debug zpool, or try to install zpool in that system pod (using the distro’s package manager) and see if that also crashes. I’ve run zfs commands inside pods created by https://github.com/kvaps/kubectl-node-shell .
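To narrow down which binary is actually crashing, a small probe like the one below could be run from inside such a debug pod. This is only a sketch under assumptions: it assumes Python is available in the pod and that `zpool` and `zfs` are on the `PATH` there; `probe` is a hypothetical helper name, and the four subcommands are just read-only examples. It relies on `subprocess` reporting a negative return code when a child is killed by a signal.

```python
import signal
import subprocess


def probe(cmd):
    """Run cmd and describe how it ended: 'ok', 'exit N', 'signal NAME' or 'not found'."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # The binary is not installed (or not on PATH) in this pod.
        return "not found"
    if result.returncode < 0:
        # A negative return code means the child was terminated by that signal.
        return "signal " + signal.Signals(-result.returncode).name
    if result.returncode == 0:
        return "ok"
    return f"exit {result.returncode}"


if __name__ == "__main__":
    # Assumed read-only commands; adjust to whatever is safe on the node.
    commands = (
        ["zpool", "version"],
        ["zpool", "import"],  # without arguments this only lists importable pools
        ["zfs", "version"],
        ["zfs", "list"],
    )
    for cmd in commands:
        print(" ".join(cmd), "->", probe(cmd))
```

A line reading `signal SIGSEGV` would point at the command that segfaults, which helps when comparing the extension's binaries against ones installed from a distro package.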
The weird thing is that it did manage to mount my pool.
The zfs binary seems to be segfaulting, while the zpool binary is fine.
Did you end up finding a solution? I have three Dell R630s, and one of the three nodes is having this same issue when starting up a brand new cluster.
It managed to mount the pool, so I don't know what the problem was. I am able to schedule pods and so on.
Hello. I get the same error from "talosctl logs ext-zfs-service" on a Raspberry Pi 4. Both on Talos 1.9.1 and 1.9.2. Segfaults ain’t fun.
If I could get some pointers I’d love to help out, sharing some logs etc.
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been stalled for 7 days with no activity.