MachineConfig.disks nofail and other options
Currently Talos reboots the system if a device listed in `machine.disks.*` is not found.
- It would be great to be able to skip that error; it helps when fixing disk problems or other issues.
- The second disk can be an HDD/SSD/NVMe device, so it would also be good to be able to set mount options like `noatime,discard`.
Machine config proposal:
```yaml
machine:
  disks:
    - device: /dev/sdb
      nofail: true # default: false
      partitions:
        - mountpoint: /var/mnt/extra
          options: # does not exist today
            - noatime
            - discard
            - nofail
```
I wonder if failures like this one should actually pause the boot sequence (to avoid further damage, e.g. by writing to the mountpoint instead of the mounted disk), but keep `apid` running so that the issue can be fixed or mitigated?
The main goal is to keep the node running. Once it is up and running, a Prometheus / HW RAID exporter can gather more details and send them to the operator.
`apid` does not have any metrics or daemons to collect such information.
This is a tough one. I don't have any ideas. What should we do if the disk isn't there?
Notes from planning meeting
I think it makes sense to split the issue:
- mount options which are passed down to the `mount()` syscall (like `noatime`, `discard`) are a great idea, and we should definitely implement that (see the sketch after this list)
- options which change the semantics or behavior of Talos (like `nofail`) are tricky: if the mount operation fails and gets skipped, the mountpoint stays empty and refers to a different disk, which might cause other cascading failures (e.g. `/var/mnt/extra` was storing a database data directory). Instead of adding `nofail` we would rather look towards changing Talos behavior on failures: pause the boot process and leave `apid` running for the operator to make a decision, i.e. change the machine configuration to remove the failed mount or perform other recovery procedures.
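To illustrate the first half of that split, a config carrying only `mount()`-level options might look like the sketch below; note that the `options` field is still only the proposal above, not an implemented part of the machine config:

```yaml
machine:
  disks:
    - device: /dev/sdb
      partitions:
        - mountpoint: /var/mnt/extra
          # These options translate directly into mount(2) arguments
          # (MS_NOATIME as a flag, "discard" in the data string) and do
          # not change how Talos reacts to a failed mount.
          options:
            - noatime
            - discard
```

Anything like `nofail` stays out of this list, since skipping a failed mount is a Talos behavior change rather than something the kernel can express.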
See also #4669
Stopping the boot is not a good idea; it cannot fix the problem, and it requires a human to fix a non-critical issue.
When you set `nofail` you usually know what you are doing. It is not a default mount flag, and many Linux users use it every day...
So as another option, Talos could apply a `NoExecute` taint to the node, which would allow only node-critical pods to run.
The kubelet flag `--register-with-taints` could be used for that. This helps an automation system fix the problem (if that is possible, of course).
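For what it's worth, the taint part of this idea can already be sketched with the existing `machine.kubelet.extraArgs` field; the taint key below is just a made-up example, and since `extraArgs` is static, the taint would apply on every boot rather than only when a mount fails:

```yaml
machine:
  kubelet:
    extraArgs:
      # Any key=value:Effect list accepted by the kubelet's
      # --register-with-taints flag; the key here is hypothetical.
      register-with-taints: "example.com/storage-degraded=true:NoExecute"
```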
Booting up even to the kubelet might lead to catastrophic failures, which we'd rather avoid. The operator can remove the mount if it's really not necessary, but in the common case this option opens the door to critical operational mistakes.
I found a more pressing issue for this feature: `inode32`. We are running dind for our build agents, and unfortunately we get `EOVERFLOW` errors in Docker volumes when they are located on our 4 TB drive, while they work fine from the 400 GB drive. The solution appears to be adding the `inode32` mount option, even though it comes with its own downsides. The issue is documented in the Red Hat article on "Inode numbers".
Remounting the volume with `mount -o remount,inode32 /host/var/mnt/extra` temporarily solves this issue for us until the next reboot.
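If the `options` field from the proposal above were implemented, this workaround could presumably become declarative, since `inode32` is just another option passed through to `mount(2)` (a hypothetical sketch, not a currently supported field):

```yaml
machine:
  disks:
    - device: /dev/sdb
      partitions:
        - mountpoint: /var/mnt/extra
          options:
            # inode32 is an XFS mount option; it would travel in the
            # data string of the mount(2) call, like discard does.
            - inode32
```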
@smira I am currently facing a real problem with the lack of other options. I really need `inode32` for my mounts to run our build jobs, which include some 32-bit tools. They fail in odd ways and can partially corrupt state because of it. Is there a way to ensure that I always have `inode32`, other than running a script in a loop in a privileged container?
We don't have a solution for this issue at the moment; #8367 is supposed to solve it, but it will come no earlier than Talos 1.8.
The issue is pretty clear, and Talos support is not great in this area at the moment.