bugs
bugs copied to clipboard
first boot after coreos-install fails on veritysetup
Issue Report
Bug
Container Linux Version
1911.3.0
Environment
Baremetal
Actual Behavior
veritysetup fails during boot, inspecting emergency shell shows that e2size /dev/<device>
returns "Success" string, not numerical value. It seems that e2size in initrd is NOT https://github.com/coreos/seismograph/blob/master/src/e2size/e2size.c
Reproduction Steps
- ...
- ...
Other Information
Same deployment procedure works fine in VMWare environment
Thanks for the report. To the best of my knowledge we didn't change anything in this regard recently. What's the exit-code of e2size <device>
? How are you installing and booting this VM? Do you have a known-good version of the same flow?
we provision using iPXE, where network boot runs coreos-install and then reboots. Second boot was exactly what was failing.
I rebooted over iPXE again and it now passes verity-setup. How strange! When it was failing, I was using emergency shell to get some intel, it it was an error from verity-setup's
/sbin/veritysetup create usr --hash-offset="$(e2size /dev/disk/by-partuuid/...) ..."
when I ran e2size /dev/didk/by-partuuid/....
I got result Success
Hm, colleague of mine suggested that this line is a culprit
https://github.com/coreos/seismograph/blob/ba4a441b18d7581514c72f53c9dcf4e90866ddbc/src/e2size/e2size.c#L31
err check was triggered, but errno
wasn't set and strerror
returns Success
Yes, that was also my suspicion. An exit-code of 1 would confirm that. I'm still unsure about the root-cause of this one-off failure though. Was the same disk used for a previous Linux installation without any zero-ing in-between?
An exit-code of 1 would confirm that.
because it is in a shell substitution, that exit code was ignored and it tried to run verity-setup anyway
Was the same disk used for a previous Linux installation without any zero-ing in-between?
doesn't coreos-install script erase whole disk anyway?