LVM activation is not entirely working
Bug Report
Description
Talos 1.8.1 improved LVM activation quite a bit, but I am still seeing some VGs missing when booting a node. These VGs are encrypted Ceph OSDs on NVMe drives. The machine has a fair number of I/O devices (10 NVMe drives, 15 SATA hard drives), so maybe it's a race or a timeout.
The issue manifests as missing /dev nodes for some VGs, usually the same two. Issuing a vgchange --refresh command in a node shell creates the missing nodes and allows the Ceph OSDs to start.
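For reference, the manual workaround looks roughly like this (a sketch; the VG names and the way you open the node shell will vary):

# In a shell on the affected node:
vgs                  # all VGs are listed, including the ones missing their nodes
ls /dev/ceph-*/      # only VGs that got their /dev nodes show up here
vgchange --refresh   # re-creates the missing nodes, after which the OSDs can start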
Logs
I see one such log entry for each of the 8 Ceph OSDs:
192.168.1.13: user: warning: [2024-10-15T23:59:03.258425978Z]: [talos] activating LVM volume {"component": "controller-runtime", "controller": "block.LVMActivationController", "name": "ceph-41785a27-5010-4418-becf-d6d52fd93f9c"}
So the controller is not missing any. There are no additional logs or differences in the log lines between the VGs that do get device nodes and those that don't.
Environment
- Talos version:
Client:
    Tag:         v1.6.6
    SHA:         7dceba06
    Built:
    Go version:  go1.21.8 X:loopvar
    OS/Arch:     darwin/arm64
Server:
    NODE:        192.168.1.13
    Tag:         v1.8.1-jfroy.2
    SHA:         e398d8a0
    Built:
    Go version:  go1.22.8
    OS/Arch:     linux/amd64
    Enabled:     RBAC
- Kubernetes version: 1.31.1
- Platform: baremetal amd64
See https://github.com/siderolabs/talos/pull/9422 and https://github.com/siderolabs/talos/pull/9333 for recent work in this area.
@jfroy as you seem to be running with your changes on top, just to make sure: do you have $PKGS matching v1.8.1? There were several fixes in pkgs as well.
I believe I am. I've rebased my release branches on yours (release-1.8) and rebuilt. But I do not build PKGS (with a few exceptions, e.g. base and kernel), so I assume that you've rebuilt upstream. Maybe the PKGS ref is stale in my forks; I will check.
But the behavior has improved, which suggests I do have the current fixes. I used to see LVM controller errors and get none of the VGs activated; now the controller seems to work and all the VGs get activated. It feels like this issue may be an interaction with udev: it is somehow missing or dropping events, or otherwise unaware of some of the activated VGs.
I've seen the big udev changes that are coming, and maybe those will help. I can try to rebase them onto 1.8 and test, though I don't know how mature the patches are.
Or I could be wrong and this is something else. I also want to go see what that refresh command does. Maybe it will suggest some further line of investigation.
I haven't seen the behavior you describe while debugging this issue; the usual failure mode is that the detection logic falsely reports the volume as activated.
I can't reproduce the issue on either main or Talos 1.8.1. I tested with 4 NVMe drives and Ceph with encryption enabled; on every reboot, all 3 worker nodes came up with all LVM volumes activated.
172.20.0.7 runtime DiscoveredVolume dm-0 1 disk 11 GB luks
172.20.0.7 runtime DiscoveredVolume dm-1 1 disk 11 GB luks
172.20.0.7 runtime DiscoveredVolume dm-2 1 disk 11 GB luks
172.20.0.7 runtime DiscoveredVolume dm-3 1 disk 11 GB luks
172.20.0.7 runtime DiscoveredVolume dm-4 1 disk 11 GB bluestore
172.20.0.7 runtime DiscoveredVolume dm-5 1 disk 11 GB bluestore
172.20.0.7 runtime DiscoveredVolume dm-6 1 disk 11 GB bluestore
172.20.0.7 runtime DiscoveredVolume dm-7 1 disk 11 GB bluestore
172.20.0.7 runtime DiscoveredVolume loop0 1 disk 77 MB squashfs
172.20.0.7 runtime DiscoveredVolume nvme0n1 1 disk 11 GB lvm2-pv 0PsH8G-afuZ-mS2J-A35D-FirN-Qpfd-OyTv2a
172.20.0.7 runtime DiscoveredVolume nvme0n2 1 disk 11 GB lvm2-pv QFHt0k-ka8i-KX1t-kTGk-WWww-SGCK-GPpHZ8
172.20.0.7 runtime DiscoveredVolume nvme0n3 1 disk 11 GB lvm2-pv K3nU4P-u2hf-6wAy-orfN-bSju-h0By-Upx64T
172.20.0.7 runtime DiscoveredVolume nvme0n4 1 disk 11 GB lvm2-pv E64OOb-lK4c-Dzfi-xiGa-9bNC-vW4T-aeQhL7
I wonder if we could dig up more from a talosctl support bundle.
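For example (assuming the node IP from this report; the bundle is written to the current directory):

talosctl -n 192.168.1.13 support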
Thanks for looking into it some more. I will get a bundle done and also investigate more on my own. I can definitely repro on that node in my cluster, so hopefully I can root-cause it.
OK, it turns out the root cause was pretty straightforward. The LVM controller can start activating volumes before udevd is running. Those volumes never get their /dev nodes, because node creation is performed by udevd in response to LVM activation events.
See the log below: two activations happen before udevd is running, and those are the two volumes missing their nodes for that particular boot sequence.
192.168.1.13: user: warning: [2024-10-23T21:58:15.064749698Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "META", "phase": "waiting -> ready"}
192.168.1.13: user: warning: [2024-10-23T21:58:15.086447698Z]: [talos] task writeIMAPolicy (1/1): done, 166.680011ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.090304698Z]: [talos] phase integrity (2/12): done, 183.238453ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.094078698Z]: [talos] phase etc (3/12): 3 tasks(s)
192.168.1.13: user: warning: [2024-10-23T21:58:15.097845698Z]: [talos] task setUserEnvVars (3/3): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.101460698Z]: [talos] task setUserEnvVars (3/3): done, 3.630649ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.104960698Z]: [talos] task CreateSystemCgroups (1/3): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.108488698Z]: [talos] task createOSReleaseFile (2/3): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.112089698Z]: [talos] task CreateSystemCgroups (1/3): using cgroups root: /
192.168.1.13: user: warning: [2024-10-23T21:58:15.115654698Z]: [talos] task createOSReleaseFile (2/3): done, 14.329314ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.136053698Z]: [talos] activating LVM volume {"component": "controller-runtime", "controller": "block.LVMActivationController", "name": "ceph-c033ff07-8e3d-439b-94bf-2b8f94445df4"}
192.168.1.13: user: warning: [2024-10-23T21:58:15.156689698Z]: [talos] task CreateSystemCgroups (1/3): done, 58.826575ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.160161698Z]: [talos] phase etc (3/12): done, 66.085972ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.163461698Z]: [talos] phase earlyServices (4/12): 4 tasks(s)
192.168.1.13: user: warning: [2024-10-23T21:58:15.166679698Z]: [talos] task startContainerd (4/4): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.169868698Z]: [talos] service[containerd](Starting): Starting service
192.168.1.13: user: warning: [2024-10-23T21:58:15.172979698Z]: [talos] task startUdevd (1/4): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.176138698Z]: [talos] task startMachined (2/4): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.179168698Z]: [talos] task startSyslogd (3/4): starting
192.168.1.13: user: warning: [2024-10-23T21:58:15.182115698Z]: [talos] service[containerd](Preparing): Running pre state
192.168.1.13: user: warning: [2024-10-23T21:58:15.185049698Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "STATE", "phase": "waiting -> failed", "error": "volume is encrypted, but no encryption config provided"}
192.168.1.13: user: warning: [2024-10-23T21:58:15.193783698Z]: [talos] service[containerd](Preparing): Creating service runner
192.168.1.13: user: warning: [2024-10-23T21:58:15.196659698Z]: [talos] service[udevd](Starting): Starting service
192.168.1.13: user: warning: [2024-10-23T21:58:15.199467698Z]: [talos] service[udevd](Preparing): Running pre state
192.168.1.13: user: warning: [2024-10-23T21:58:15.202199698Z]: [talos] service[syslogd](Starting): Starting service
192.168.1.13: user: warning: [2024-10-23T21:58:15.204872698Z]: [talos] service[containerd](Running): Process Process(["/bin/containerd" "--address" "/system/run/containerd/containerd.sock" "--state" "/system/run/containerd" "--root" "/system/var/lib/containerd"]) started with PID 1896
192.168.1.13: user: warning: [2024-10-23T21:58:15.212845698Z]: [talos] service[syslogd](Waiting): Waiting for service "machined" to be "up"
192.168.1.13: user: warning: [2024-10-23T21:58:15.215571698Z]: [talos] task startSyslogd (3/4): done, 38.144554ms
192.168.1.13: user: warning: [2024-10-23T21:58:15.218198698Z]: [talos] service[machined](Starting): Starting service
192.168.1.13: user: warning: [2024-10-23T21:58:15.220825698Z]: [talos] service[machined](Preparing): Running pre state
192.168.1.13: user: warning: [2024-10-23T21:58:15.223438698Z]: [talos] service[udevd](Preparing): Creating service runner
192.168.1.13: user: warning: [2024-10-23T21:58:15.226068698Z]: [talos] service[machined](Preparing): Creating service runner
192.168.1.13: user: warning: [2024-10-23T21:58:15.228763698Z]: [talos] service[udevd](Running): Process Process(["/sbin/udevd" "--resolve-names=never"]) started with PID 1902
192.168.1.13: user: warning: [2024-10-23T21:58:15.234298698Z]: [talos] service[machined](Running): Service started as goroutine
192.168.1.13: user: warning: [2024-10-23T21:58:15.296740698Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "STATE", "phase": "failed -> failed", "error": "volume is encrypted, but no encryption config provided"}
192.168.1.13: user: warning: [2024-10-23T21:58:15.407836698Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "STATE", "phase": "failed -> failed", "error": "volume is encrypted, but no encryption config provided"}
192.168.1.13: user: warning: [2024-10-23T21:58:15.454934698Z]: [talos] activating LVM volume {"component": "controller-runtime", "controller": "block.LVMActivationController", "name": "ceph-cd59b9e6-3943-4927-b7a5-c95d920a35ed"}
192.168.1.13: daemon: info: [2024-10-23T21:58:15.513383698Z]: udevd[1902]: starting version 3.2.14
192.168.1.13: daemon: info: [2024-10-23T21:58:15.520355698Z]: udevd[1902]: starting eudev-3.2.14
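A quick way to confirm the ordering on any given boot is to filter the kernel message buffer for both event streams (a sketch, using this node's IP):

talosctl -n 192.168.1.13 dmesg | grep -E 'activating LVM volume|udevd'
# Any "activating LVM volume" line stamped earlier than udevd's startup
# corresponds to a VG whose /dev nodes were never created.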
Oh, thanks for digging into this, this should be easy to fix!
@jfroy not totally related, but what is the NVMe hardware in your box?
(or talosctl get pcidevices)
I wonder how they come into existence before udevd starts (as it should load the required modules).
Ah, my kernel has most drivers built in rather than as modules. I arguably should modularize it a bit more, but the ordering issue should probably still be fixed.
I checked, and indeed your kernel config has CONFIG_BLK_DEV_NVME=m while mine has CONFIG_BLK_DEV_NVME=y.
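If anyone wants to check their own image: whether the NVMe driver is built in or modular can be read from the running kernel's config (a sketch; it only works if the kernel exposes /proc/config.gz, i.e. CONFIG_IKCONFIG_PROC=y):

zcat /proc/config.gz | grep BLK_DEV_NVME
# CONFIG_BLK_DEV_NVME=y: built in, so NVMe devices exist before udevd starts
# CONFIG_BLK_DEV_NVME=m: a module loaded by udevd, so devices appear only after it runs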
They are Micron 7300 Pro [8x].
OK, that explains why I can't reproduce it, but we'll still get a fix in for the next 1.8.2. Thank you!