[bug] Talos configuration will apply the disks section before all devices are ready
Bug Report
Description
The Talos configuration for machine.disks gets applied on startup without ensuring that all disks on the system are ready and available. Applying it before the disks have finished initializing can cause the machine to fail configuration, which triggers a reboot and can leave the machine in a boot loop.
In my case, I had attached a 60-bay JBOD to a node. On a regular boot, it saw all the disks just fine, but initialization was slow as it enumerated them. Once I tried to configure the disks as mounts within Talos, the machine started panicking and went into a reboot loop.
I have another box with 12 drives that I was able to configure and mount just fine. Those drives were spread across 3 different HBAs and 6 separate channels (2 channels per card). The 60-bay JBOD, by contrast, was connected entirely over a single channel to a single HBA.
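For context, a machine.disks section has roughly the shape sketched below. The device path and mountpoint are illustrative placeholders, not the values from the affected node; a machine with many drives would carry one such entry per disk.

```yaml
machine:
  disks:
    # Placeholder device path; the real config lists each JBOD disk.
    - device: /dev/sdb
      partitions:
        # Placeholder mountpoint for the partition created on this disk.
        - mountpoint: /var/mnt/disk1
```

If a listed device has not yet been enumerated when this section is applied, the configuration step has nothing to act on, which matches the failure described above.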
Logs
Mentioned in Slack: https://taloscommunity.slack.com/archives/CMARMBC4E/p1711685935164829
Environment
- Talos version: 1.6.7
- Kubernetes version: 1.29.2
- Platform: metal / amd64