talos
Error upgrading from 1.9.5 to 1.9.6
Bug Report
The upgrade fails with the following error: "failed to probe bootloader on upgrade: file does not exist".
Description
See my current configuration here: https://github.com/jseely/talos-config. The machine in question is Talos2.
Logs
user: warning: [2025-05-10T21:26:24.651751283Z]: [talos] task unmountSystemDiskBindMounts (1/1): starting
user: warning: [2025-05-10T21:26:24.651839283Z]: [talos] task unmountSystemDiskBindMounts (1/1): unmounting /system/state
kern: notice: [2025-05-10T21:26:24.651999283Z]: XFS (sdb5): Unmounting Filesystem 42fdfd20-7af3-4a1c-a83e-639df8846e8c
user: warning: [2025-05-10T21:26:24.674184283Z]: [talos] task unmountSystemDiskBindMounts (1/1): unmounting /var
kern: notice: [2025-05-10T21:26:24.840656283Z]: XFS (sdb6): Unmounting Filesystem 6c3a0e60-1a9c-4e27-9e4c-44c7f3b6adb4
user: warning: [2025-05-10T21:26:24.865315283Z]: [talos] task unmountSystemDiskBindMounts (1/1): done, 213.561982ms
user: warning: [2025-05-10T21:26:24.865346283Z]: [talos] phase unmountBind (7/14): done, 213.618541ms
user: warning: [2025-05-10T21:26:24.865358283Z]: [talos] phase unmountSystem (8/14): 2 tasks(s)
user: warning: [2025-05-10T21:26:24.865377283Z]: [talos] task unmountStatePartition (2/2): starting
user: warning: [2025-05-10T21:26:24.865415283Z]: [talos] task unmountEphemeralPartition (1/2): starting
user: warning: [2025-05-10T21:26:24.865543283Z]: [talos] task unmountStatePartition (2/2): done, 165.911µs
user: warning: [2025-05-10T21:26:24.865575283Z]: [talos] task unmountEphemeralPartition (1/2): done, 169.595µs
user: warning: [2025-05-10T21:26:24.865601283Z]: [talos] phase unmountSystem (8/14): done, 244.182µs
user: warning: [2025-05-10T21:26:24.865612283Z]: [talos] phase volumeFinalize (9/14): 1 tasks(s)
user: warning: [2025-05-10T21:26:24.865631283Z]: [talos] task teardownLifecycle (1/1): starting
user: warning: [2025-05-10T21:26:24.865977283Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "STATE", "phase": "ready -> closed", "location": "/dev/sdb5", "parentLocation": "/dev/sdb"}
user: warning: [2025-05-10T21:26:24.866006283Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "EPHEMERAL", "phase": "ready -> closed", "location": "/dev/sdb6", "parentLocation": "/dev/sdb"}
user: warning: [2025-05-10T21:26:24.866029283Z]: [talos] volume status {"component": "controller-runtime", "controller": "block.VolumeManagerController", "volume": "META", "phase": "ready -> closed", "location": "/dev/sdb4", "parentLocation": "/dev/sdb"}
user: warning: [2025-05-10T21:26:24.866164283Z]: [talos] task teardownLifecycle (1/1): done, 531.599µs
user: warning: [2025-05-10T21:26:24.866185283Z]: [talos] phase volumeFinalize (9/14): done, 573.766µs
user: warning: [2025-05-10T21:26:24.866195283Z]: [talos] phase upgrade (10/14): 1 tasks(s)
user: warning: [2025-05-10T21:26:24.866212283Z]: [talos] task upgrade (1/1): starting
user: warning: [2025-05-10T21:26:24.866232283Z]: [talos] task upgrade (1/1): performing upgrade via "factory.talos.dev/installer/60b42e4f2f1eaee545c2436154a21f67ad285e596c106a1fb8f827954a8ed391:v1.9.6"
user: warning: [2025-05-10T21:26:24.957519283Z]: 2025/05/10 21:26:28 running Talos installer v1.9.6
user: warning: [2025-05-10T21:26:24.961308283Z]: 2025/05/10 21:26:28 system disk wipe on upgrade is not supported anymore, option ignored
user: warning: [2025-05-10T21:26:24.963150283Z]: 2025/05/10 21:26:28 running pre-flight checks
user: warning: [2025-05-10T21:26:24.964415283Z]: 2025/05/10 21:26:28 host Talos version: v1.9.5
user: warning: [2025-05-10T21:26:24.966966283Z]: 2025/05/10 21:26:28 host Kubernetes versions: kubelet: 1.32.3, kube-apiserver: 1.32.3, kube-scheduler: 1.32.3, kube-controller-manager: 1.32.3
user: warning: [2025-05-10T21:26:24.966977283Z]: 2025/05/10 21:26:28 all pre-flight checks successful
user: warning: [2025-05-10T21:26:24.989441283Z]: Error: failed to probe bootloader on upgrade: file does not exist
Environment
- Talos version:
Client:
Tag: v1.9.5
SHA: undefined
Built: 2025-03-12T13:12:47Z
Go version: go1.24.1
OS/Arch: linux/amd64
Server:
NODE: 10.0.144.106
Tag: v1.9.5
SHA: d07f6daa
Built:
Go version: go1.23.7
OS/Arch: linux/amd64
Enabled: RBAC
- Kubernetes version: 1.32.3
- Platform: Bare metal
It looks like Talos fails to find the bootloader. You should be using GRUB unless this is a Secure Boot system.
Can you please post the output of talosctl get dv for that machine?
➜ ~ talosctl get dv -n 10.0.144.106
NODE NAMESPACE TYPE ID VERSION TYPE SIZE DISCOVERED LABEL PARTITIONLABEL
10.0.144.106 runtime DiscoveredVolume dm-0 1 disk 1.2 TB luks
10.0.144.106 runtime DiscoveredVolume dm-1 1 disk 1.2 TB luks
10.0.144.106 runtime DiscoveredVolume loop3 1 disk 74 MB squashfs
10.0.144.106 runtime DiscoveredVolume sda 1 disk 62 GB iso9660 TALOS_V1_9_5
10.0.144.106 runtime DiscoveredVolume sdb 1 disk 299 GB gpt
10.0.144.106 runtime DiscoveredVolume sdb1 1 partition 105 MB vfat EFI
10.0.144.106 runtime DiscoveredVolume sdb2 1 partition 1.0 MB BIOS
10.0.144.106 runtime DiscoveredVolume sdb3 1 partition 1.0 GB BOOT
10.0.144.106 runtime DiscoveredVolume sdb4 1 partition 1.0 MB META
10.0.144.106 runtime DiscoveredVolume sdb5 1 partition 105 MB xfs STATE STATE
10.0.144.106 runtime DiscoveredVolume sdb6 1 partition 298 GB xfs EPHEMERAL EPHEMERAL
10.0.144.106 runtime DiscoveredVolume sdc 1 disk 1.2 TB lvm2-pv 4538c6-IEUU-lrn2-VOSJ-cXP0-s0RD-e9x0Ok
10.0.144.106 runtime DiscoveredVolume sdd 1 disk 1.2 TB lvm2-pv mb1MFv-4ZLU-pY55-ttsM-muyk-vH5Z-do325z
Could it be mistakenly unmounting the wrong disk?
What is strange is that the BOOT partition is not detected as xfs, which it should be with a default Talos install.
Yeah, it looks like the other node with similar hardware picks up the filesystem type of the boot partition properly. I'll try a fresh install on the affected node and see if that fixes it.
➜ ~ talosctl get dv -n 10.0.144.105
NODE NAMESPACE TYPE ID VERSION TYPE SIZE DISCOVERED LABEL PARTITIONLABEL
10.0.144.105 runtime DiscoveredVolume dm-0 1 disk 1.2 TB
10.0.144.105 runtime DiscoveredVolume dm-1 1 disk 1.2 TB
10.0.144.105 runtime DiscoveredVolume loop3 1 disk 74 MB squashfs
10.0.144.105 runtime DiscoveredVolume sdb 1 disk 299 GB gpt
10.0.144.105 runtime DiscoveredVolume sdb1 1 partition 105 MB vfat EFI EFI
10.0.144.105 runtime DiscoveredVolume sdb2 1 partition 1.0 MB BIOS
10.0.144.105 runtime DiscoveredVolume sdb3 1 partition 1.0 GB xfs BOOT BOOT
10.0.144.105 runtime DiscoveredVolume sdb4 1 partition 1.0 MB talosmeta META
10.0.144.105 runtime DiscoveredVolume sdb5 1 partition 105 MB xfs STATE STATE
10.0.144.105 runtime DiscoveredVolume sdb6 1 partition 298 GB xfs EPHEMERAL EPHEMERAL
10.0.144.105 runtime DiscoveredVolume sdc 1 disk 299 GB
10.0.144.105 runtime DiscoveredVolume sdd 1 disk 1.2 TB lvm2-pv eQcIiv-xZT3-aRub-gJmI-HWCq-p2he-BrHP2F
10.0.144.105 runtime DiscoveredVolume sde 1 disk 1.2 TB lvm2-pv BFOZAj-wXSA-vmze-89LK-RK2d-wCwc-eNL5Tl
10.0.144.105 runtime DiscoveredVolume sdf 1 disk 1.2 TB
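For anyone comparing nodes the same way, here is a minimal Python sketch that scans pasted talosctl get dv text output and flags rows that mention BOOT without a detected xfs filesystem. It is not part of Talos: the function name find_undetected_boot is hypothetical, and it assumes the whitespace-separated layout shown in this thread, where empty columns simply disappear. Because of that, it only checks whether "xfs" appears anywhere after the size column instead of trying to recover exact column positions.

```python
# Rows copied from the two `talosctl get dv` outputs in this thread.
SAMPLE = """\
NODE NAMESPACE TYPE ID VERSION TYPE SIZE DISCOVERED LABEL PARTITIONLABEL
10.0.144.106 runtime DiscoveredVolume sdb1 1 partition 105 MB vfat EFI
10.0.144.106 runtime DiscoveredVolume sdb3 1 partition 1.0 GB BOOT
10.0.144.106 runtime DiscoveredVolume sdb5 1 partition 105 MB xfs STATE STATE
10.0.144.105 runtime DiscoveredVolume sdb1 1 partition 105 MB vfat EFI EFI
10.0.144.105 runtime DiscoveredVolume sdb3 1 partition 1.0 GB xfs BOOT BOOT
10.0.144.105 runtime DiscoveredVolume sdb5 1 partition 105 MB xfs STATE STATE
"""


def find_undetected_boot(output: str, expected: str = "xfs") -> list[tuple[str, str]]:
    """Return (node, volume id) pairs for BOOT rows lacking the expected filesystem.

    Hypothetical helper: assumes each data row looks like
    NODE NAMESPACE TYPE ID VERSION TYPE SIZE UNIT [DISCOVERED] [LABEL] [PARTITIONLABEL].
    """
    suspects = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the header, shell prompts, and anything too short to be a data row.
        if len(fields) < 9 or fields[2] != "DiscoveredVolume":
            continue
        # Everything after "SIZE UNIT" is some mix of the three optional columns.
        trailing = fields[8:]
        if "BOOT" in trailing and expected not in trailing:
            suspects.append((fields[0], fields[3]))
    return suspects


if __name__ == "__main__":
    for node, volume in find_undetected_boot(SAMPLE):
        print(f"{node}: {volume} is labelled BOOT but no xfs filesystem was detected")
```

Run against the rows above, this reports only sdb3 on 10.0.144.106, which matches the difference between the two nodes noted in the comments.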
This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.