iPXE fails to boot when using the latest LetsEncrypt trust chains
Any HTTPS server using LetsEncrypt might suddenly stop working with our iPXE binaries after the next certificate refresh.
Expected Behaviour
iPXE boots normally
Current Behaviour
Invalid argument (http://ipxe.org/1c0de802)
Context
- @grahamc's boot server refreshed its LE certificate and iPXE booting in @equinixmetal datacenters started failing.
- https://community.letsencrypt.org/t/production-chain-changes/150739
- ipxe/ipxe#116
With NixOS 21.05 you can work around this by preferring their shorter chain:
  security.acme = {
    acceptTerms = true;
    email = "[email protected]";
-   certs."${domain}".keyType = "rsa4096";
+   certs."${domain}" = {
+     keyType = "rsa4096";
+     extraLegoRunFlags = [
+       # re: https://community.letsencrypt.org/t/production-chain-changes/150739/1
+       # re: https://github.com/ipxe/ipxe/pull/116
+       # re: https://github.com/ipxe/ipxe/pull/112
+       # re: https://lists.ipxe.org/pipermail/ipxe-devel/2020-May/007042.html
+       "--preferred-chain" "ISRG Root X1"
+     ];
+   };
  };
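To confirm that a server is actually presenting the shorter chain after the next renewal, one option is to inspect the issuer lines printed by `openssl s_client -showcerts`. The following is a minimal sketch, not part of the original workaround: it assumes the usual `i:` issuer-line layout of that output, and the function names and the `boot.example.com` hostname are hypothetical. It treats a chain as "short" when no certificate in it was issued by "DST Root CA X3", the cross-signing root that the longer default chain ends in.

```python
from typing import List

# Issuer name that appears only in the longer, cross-signed Let's Encrypt chain.
CROSS_SIGN_ISSUER = "DST Root CA X3"


def chain_issuers(showcerts_output: str) -> List[str]:
    """Collect the issuer ("i:") lines from `openssl s_client -showcerts` output."""
    issuers = []
    for line in showcerts_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("i:"):
            issuers.append(stripped[2:].strip())
    return issuers


def uses_short_chain(showcerts_output: str) -> bool:
    """Return True if a chain was found and none of its certificates
    was issued by the cross-signing root (assumption: this is what
    distinguishes the two chains)."""
    issuers = chain_issuers(showcerts_output)
    if not issuers:
        # No chain found in the output, so nothing can be confirmed.
        return False
    return not any(CROSS_SIGN_ISSUER in issuer for issuer in issuers)
```

The input could be captured with something like `openssl s_client -connect boot.example.com:443 -showcerts </dev/null` and passed to `uses_short_chain` as a string. A `False` result on a chain that does contain certificates suggests the server is still serving the longer chain.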
I noticed that https://github.com/ipxe/ipxe/pull/116 was merged. I believe this should be fixable by simply upgrading which version of ipxe we use.
Uh, am I missing something? It doesn't look merged to me...
Should we just apply the patch ourselves? How would you like to proceed @nshalman?
I'm not sure. I think we've worked around this in our production environment by preferring the alternate chain. I think we can wait-and-see a little longer in case upstream finally merges that change and then we can just advance to the latest.
Also, given #213 and a desire to re-work how we handle our iPXE builds, perhaps we should defer any changes for this to when we have a broken-out iPXE build...
I spent some time looking at the current state of this issue, looking especially for projects that have successfully worked around it, projects that are carrying the PR 116 patch, and organizations that are blocked for whatever reason.
workarounds
- Flatcar uses the "ISRG Root X1" chain, per https://github.com/flatcar-linux/Flatcar/issues/527
- NixOS uses the "ISRG Root X1" chain, as above. https://github.com/tinkerbell/boots/issues/166#issuecomment-862704961
carrying the patch
- Arch Linux distro incorporated the TLS Fragmentation patch, noted at https://github.com/archlinux/svntogit-community/commit/3daaf0532e3457222997fbdfeb37fbd12c5bf237
- netboot.xyz has a testing fork carrying the PR 116 patch at https://github.com/netbootxyz/ipxe/tree/testing_upstream_pr116, noted at https://github.com/netbootxyz/netboot.xyz/pull/920; it is unclear to me whether this is in production
blocked
- Harvester has an open issue at https://github.com/harvester/harvester/issues/2226 and documentation with some workarounds at https://github.com/harvester/ipxe-examples/tree/main/equinix
- XCP-ng (verbal report, no public issue as of yet)