Make sure os boot entry is not at the top of boot order
Pull Request Checklist
- [ ] implement the feature
- [ ] write the documentation
- [ ] extend the test coverage
- [ ] update the specification
- [ ] adjust plugin docstring
- [ ] modify the json schema
- [ ] mention the version
- [ ] include a release note
See https://github.com/teemtee/tmt/pull/3645, seems like you and @martinhoyer are dancing around the same problem.
hmmm, Martin's mr is trying to fix my issue? While this mr is trying to implement your thought from the slack channel: "but: isn't this patch something that we should eventually add to tmt-reboot?"
TBH, I'm not sure. I just saw both of you poking the same issue and the same part of the code. It made sense to me to get you two together so you could sort out what's the best approach.
I just saw both of you poking the same issue and the same part of the code
hmmm, np. IIUC, we are not poking the same issue: the key part of Martin's mr is efibootmgr -n (setting the next boot) to make sure an EFI system boots correctly into the os, while mine is efibootmgr -o, to make sure the os boot entry is not at the top of the boot order. And it looks like we are not poking the same code either. I didn't get the whole picture, but it seems that he is trying to add a new script, while mine makes tmt-reboot more robust :) However, I may be wrong. Martin, feel free to ping me to talk about this :smiley:
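To make the distinction concrete, here is a minimal Python sketch of the two efibootmgr invocations being compared. The helper names and the sample entry numbers are hypothetical; the functions only build argument lists and execute nothing, and this is not the code from either mr.

```python
def next_boot_command(entry: str) -> list[str]:
    """One-shot override: -n sets BootNext, which applies to the next boot only."""
    return ["efibootmgr", "-n", entry]


def demote_entry_command(boot_order: list[str], entry: str) -> list[str]:
    """Persistent change: -o rewrites BootOrder, here with `entry` moved to the end.

    If `entry` is not in `boot_order`, it is simply appended.
    """
    reordered = [e for e in boot_order if e != entry] + [entry]
    return ["efibootmgr", "-o", ",".join(reordered)]


# Hypothetical entries: 0003 is the os entry, 0001 is network boot.
print(next_boot_command("0003"))
print(demote_entry_command(["0003", "0001", "0002"], "0003"))
```

The first command leaves BootOrder untouched, so it does not help a later job that relies on network boot being first; the second changes the persistent order.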
@happz @thrix @psss Miro mentioned that there are some complaints on the beaker-admin list about the boot order issue, and there are still several days left before the end of this month, so I decided to look into this today. Here are the updates:
- xiaofeng didn't see the problem again after his mr changing the boot order was merged.
- qinghua saw the problem because she uses her team's code, which runs bootc install to-existing-root but has no code similar to my mr that moves the bootc boot option to the end of the list.
- The complaints on the beaker-admin list are likely caused by other teams that have bootc install to-existing-root in their code but nothing similar to the code in this mr.
- Both qinghua's and xiaofeng's teams use tmt-reboot, which I guess is also used by other teams that are using or will use bootc install.
Maybe I'm missing something, but wouldn't https://github.com/teemtee/tmt/pull/3645 be enough for everyone?
np, your mr will help those who want to set the next boot, while mine is for those who want to make sure the os entry is not at the top of the boot order list.
In conclusion, both Martin's mr and this mr are needed. Just as Milos said, we need to fix the bootc-on-top-of-the-boot-order issue from the tmt side; if not, other teams will run into a similar problem.
Ah, maybe some of you feel confused about what the problem is, judging from the channel messages. The problem is that with bootc install to-existing-root, bootupd adds a new boot option and configures it as the first boot entry. After the job finishes, the following jobs cannot provision the server again unless someone manually changes the boot order and makes pxe the first entry.
The reason users don't see this with the so-called package mode installation is that beaker adds boot-order-changing code to the %post part of each job's kickstart.
Regarding Martin's concern, "This seems specific to bare metal beaker-like workflow.": I could have this mr change the boot order only if /root/EFI_BOOT_ENTRY.TXT (yeah, beaker stuff) exists, if you folks want :)
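As a rough illustration of the reordering described in this comment, the sketch below parses efibootmgr-style output and computes a boot order with the currently booted entry moved to the end. The sample output and function names are made up for the example, and treating BootCurrent as the entry bootupd added is an assumption, not something this mr's code is confirmed to do.

```python
import re

# Illustrative output only; real entry numbers and labels differ per machine.
SAMPLE_OUTPUT = """\
BootCurrent: 0003
BootOrder: 0003,0001,0002
Boot0001* UEFI: PXE IPv4
Boot0002* UEFI: Built-in EFI Shell
Boot0003* redhat
"""


def parse_boot_state(output: str) -> tuple[str, list[str]]:
    """Return (BootCurrent, BootOrder as a list) from efibootmgr-style text."""
    current = re.search(r"^BootCurrent:\s*([0-9A-Fa-f]{4})\s*$", output, re.MULTILINE)
    order = re.search(r"^BootOrder:\s*(\S+)\s*$", output, re.MULTILINE)
    if current is None or order is None:
        raise ValueError("BootCurrent or BootOrder not found in output")
    return current.group(1), order.group(1).split(",")


def order_with_os_entry_last(output: str) -> list[str]:
    """Move the currently booted entry (assumed to be the os entry) to the end."""
    current, order = parse_boot_state(output)
    return [e for e in order if e != current] + [current]


# The joined result is what would be passed to `efibootmgr -o`.
print(",".join(order_with_os_entry_last(SAMPLE_OUTPUT)))
```

With the sample above, the network entry 0001 ends up first again, which is the state the following jobs need in order to provision the server.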
The problem is with bootc install to-existing-root, bootupd added a new boot options and configured as first boot, so after the job is finished, the following jobs will not be able to provision the server again, unless change the boot order, and make pxe as first order manually
But why would this be solved in tmt-reboot?
This will ultimately be yet another wonky workaround for beaker being dumb and obsolete. In any case, if it is being done, it should be set after first boot imho, preferably through something like kickstart post-install scripts and let tmt-reboot set next-boot to the booted entry as the least-worst workaround.
Just back from vacation.
But why would this be solved in tmt-reboot?
Because users will call tmt-reboot every time after "bootc install to-existing-root", which is the cause of the boot order issue. FYI, AFAIK we only have two users whose jobs caused the boot order issue, and both of their (xiaofeng's and qinghua's) issues were fixed after the code was added.
This will ultimately be yet another wonky workaround for beaker being dumb and obsolete.
I'm sorry, but I don't see how it could be wonky. To be honest, I don't see how we could do it in a more proper way :)
In any case, if it is being done, it should be set after first boot imho,
You mean the first boot after bootupd added a new boot entry, right? I'm sorry, but I don't see how that would be better.
let tmt-reboot set next-boot to the booted entry as the least-worst workaround.
tmt-reboot is already doing it.
Summary, as requested:
- There is an assumption that a provisioned beaker machine always has network boot as the 1st boot entry, to prevent the system from becoming unavailable for provisioning.
- This is why tmt-reboot, and potentially the tmt-ensure-efiboot-order script, change the "next" boot entry, so the run can continue even after boot/crash.
- This assumption can be broken, be it bootc related or otherwise.
The proposed fix is to always set network boot as 1st boot entry when running tmt-reboot if a /root/EFI_BOOT_ENTRY.TXT file exists, suggesting this is a beaker-provisioned machine.
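For reference, the proposed guard could look roughly like the following Python sketch. The function name, the way the network entry is passed in, and the temporary marker path used in the demo are all assumptions made for illustration; only the /root/EFI_BOOT_ENTRY.TXT check comes from the proposal itself.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Marker file named in the proposal; its presence is taken as a hint that
# the machine was provisioned by beaker.
BEAKER_MARKER = Path("/root/EFI_BOOT_ENTRY.TXT")


def network_first_order(
    order: list[str], network_entry: str, marker: Path = BEAKER_MARKER
) -> Optional[list[str]]:
    """Return a boot order with `network_entry` first, or None if the marker is absent."""
    if not marker.exists():
        return None  # not recognized as beaker-provisioned: leave the order alone
    return [network_entry] + [e for e in order if e != network_entry]


# Demo with a temporary stand-in for the marker file (hypothetical entries).
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "EFI_BOOT_ENTRY.TXT"
    print(network_first_order(["0003", "0001", "0002"], "0001", marker))  # marker missing
    marker.touch()
    print(network_first_order(["0003", "0001", "0002"], "0001", marker))  # marker present
```

Note that the check only detects the marker file; it cannot tell whether the machine is actually expected to be re-provisioned.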
My issues with it:
- Not all beaker machines are set to be always re-provisioned and tmt has no way to know what is the expected boot order.
  - Example scenario: I have a dedicated bare metal machine, which is not being re-provisioned, but instead tmt provision -h connect is being used to run tests on it. Using this workaround would change the boot order unexpectedly, at least requiring manual intervention (setting the boot order through system mgmt/console) or, in the worst case, leading to data loss due to beaker re-provisioning the system.
- This should be solved where it's getting broken and tmt should work with the above assumption.
- If we really want to have this in tmt, it should be done as part of the provisioning code, not hard-coding boot order on every tmt-reboot execution.
That's about it from me on this PR. With all the wasted time on this and things like this, we could have beaker replaced with a suitable modern platform ten times already. :)
And here is my feedback summary,as requested.
With all the wasted time on this
Yeah, I do feel I have wasted way more time than I expected
example scenario: I have a dedicated bare metal machine, which is not being re-provisioned, but instead tmt provision -h connect is being used to run tests on it.
I did overlook the connect scenario. If I had noticed it, or if this info had been provided earlier, I would already have closed this mr :neutral_face:. Now I agree with you: tmt-reboot is not the best place to put the code. Maybe we could move this mr's boot-order-changing code into your mr and tell users that if their test code is likely to change the boot order, they need to run your script; or maybe just tell users they need to pay more attention if their code is going to change the boot order. Either way, my mr is not needed anymore, feel free to close it.
If we really want to have this in tmt, it should be done as part of the provisioning code, not hard-coding boot order on every tmt-reboot execution.
In xiaofeng's and qinghua's scenario, the bootc boot entry does not exist during provisioning, so tmt cannot fix the issue that way. However, I now agree that tmt should not hard-code the boot order in tmt-reboot :)
Is this still relevant after 32bac2ecb6566cfc5f9d024cec2a147e67fe8761?
Closing as not planned, based on https://github.com/teemtee/tmt/pull/3665#issuecomment-3105596500