systemd UKI: add support for choosing between multiple device trees

In connection with the discussion in 1 this pull request implements a possible solution where further device trees are appended to the .dtb section and the stub tries to match the FW provided device tree with the ones present in the UKI. It also adds the necessary changes to ukify for it to be able to build such UKIs. The specification change can be found in 2.

Aug 24 '23 17:08 diogo-ivo

also this behaviour needs some documentation somwhere, i.e. sd-stub man page or so

Aug 25 '23 08:08 poettering

I think we need a more generalized approach. Appending like this might work for DTBs, but I don't think it will work for command lines or other stuff that we want to make selectable among multiple options. We should find a way to use separate sections for each payload I think.

@medhefgo does the PE spec/tooling allow multiple sections with the same name?

Aug 29 '23 16:08 bluca

@medhefgo does the PE spec/tooling allow multiple sections with the same name?

Theoretically, yes.

Aug 29 '23 16:08 medhefgo

Is there an advantage to keeping the same section name or would it be possible to name the extra sections .dtb-* as well?

Aug 29 '23 17:08 diogo-ivo

I think using the same name would be easier, as the definition doesn't need to change, the names are fixed and not dynamic, and appending doesn't need to care about current state

Aug 29 '23 17:08 bluca

How big (roughly) is an average dtb and how many would typically be added to an image?

Aug 29 '23 17:08 medhefgo

For the use case that we are targeting it would be something around 5 dtbs, each around 50 to 80 kB

Aug 29 '23 17:08 diogo-ivo

@medhefgo does the PE spec/tooling allow multiple sections with the same name?

I'd prefer not to. the section name is such a nice identifier, let's maintain that as a unique handle. We could just say, that if multiple alternatives of specific sections shall be allowed, we define a clear algorithm how to generate section names from the "primary" section name plus some counter. i.e. take the section name, and suffix it with a decimal counter starting from 2. If this would be longer than 8ch then shorten the name before appending the suffix so that it fits.

Hence, for .initrd the next alternatives would be .initrd2, .initrd3, .initrd4, …, .initr65, … .init113, …

For .cmdline (which already is 8ch long), the alternative names would be .cmdlin2, .cmdlin3, .cmdlin4, …, .cmdli64, …, .cmdl113, …

you get the idea.

Aug 31 '23 10:08 poettering

@medhefgo does the PE spec/tooling allow multiple sections with the same name?

I'd prefer not to. the section name is such a nice identifier, let's maintain that as a unique handle.

That's why I think it's nicer if the name is fixed, so that it's an easy and obvious api, no? It's what we already do for every other setting after all - eg: you add multiple ExecStart=, you don't add ExecStart1= ExecStart2= etc.

We could just say, that if multiple alternatives of specific sections shall be allowed, we define a clear algorithm how to generate section names from the "primary" section name plus some counter. i.e. take the section name, and suffix it with a decimal counter starting from 2. If this would be longer than 8ch then shorten the name before appending the suffix so that it fits.

Hence, for .initrd the next alternatives would be .initrd2, .initrd3, .initrd4, …, .initr65, … .init113, …

For .cmdline (which already is 8ch long), the alternative names would be .cmdlin2, .cmdlin3, .cmdlin4, …, .cmdli64, …, .cmdl113, …

you get the idea.

This seems quite complex and error prone, and not very nice on the eyes. We'd need to start worrying about clashes when there's enough entries and other unpleasantness. Every producer/parser will need to implement it exactly right. Can't say I like that scheme.

Aug 31 '23 11:08 bluca

FWIW I agree with @bluca that keeping the same section name leads to a simpler design. I can implement this and update the pull request

Sep 01 '23 11:09 diogo-ivo

Thank you for the reviews up until this point. Hopefully this addresses the previous comments and fixes the things that were wrong in the previous implementation. To finalize this feature the discussion in 1 should also be addressed.

Sep 14 '23 09:09 diogo-ivo

also needs updates to systemd-measure so that it it can pre-calculate the expected measurements correctly.

Oct 04 '23 11:10 poettering

For systemd-measure I suggest then for it to accept multiple -dtb switches, one for each dtb, whose order in the UKI is defined by sorting the first entry of the compatible field alphabetically, so that there are no ordering issues for computing digests. I'll implement it and extend the specification to accommodate this

Oct 17 '23 11:10 diogo-ivo

Hmmm, I'm still not sure if having multiple sections with the same name is the best thing. It does make the code nice, because we completely avoid the issue of naming, but I'm worried if other tools will support this properly.

Given the limitations in section name sizes, it's the only sensible choice. If other tools choke on something supported by the specification, well, that would be bug in said tools

Oct 17 '23 14:10 bluca

Hmmm, I'm still not sure if having multiple sections with the same name is the best thing. It does make the code nice, because we completely avoid the issue of naming, but I'm worried if other tools will support this properly.

Given the limitations in section name sizes, it's the only sensible choice. If other tools choke on something supported by the specification, well, that would be bug in said tools

Yeah, I guess we should proceed with this approach. If it turns out that there are unexpected issues, we can always try a different approach. At least ukify and objdump seem happy with the multiple sections.

Oct 17 '23 15:10 keszybz

A minor RFE: it would be great if ukify inspect could be extended to print some identifier (I guess the "compatible" string?). Right now if there are multiple sections, it would be quite hard to distinguish them. But this is not a blocker for this, just something that'd be nice to have at some point.

Oct 17 '23 15:10 keszybz

At least ukify and objdump seem happy with the multiple sections.

Well, if you're fine with objcopy not being able to assemble such a file. It throws a very helpful "file in wrong format" error. llvm-objcopy seems to be a bit better at this.

(mkinitcpio and dracut keep insisting on using objcopy + duct tape instead of a proper tool designed for this stuff.)

Oct 17 '23 15:10 medhefgo

At least ukify and objdump seem happy with the multiple sections.

Well, if you're fine with objcopy not being able to assemble such a file. It throws a very helpful "file in wrong format" error. llvm-objcopy seems to be a bit better at this.

(mkinitcpio and dracut keep insisting on using objcopy + duct tape instead of a proper tool designed for this stuff.)

In Ubuntu Core we too tried to use objcopy, and eventually gave to use llvm-objcopy only, it works, and it works on arm64/riscv64, unlike GNU objcopy.

Oct 18 '23 23:10 xnox

@kukrimate @julian-klode we will likely need this in 24.04, UC24, 22.04, UC22 => can you please help review and/or rewrite to get this along. The devicetree compatible match seems ok-ish. I sort of wish to have a dmidecode match as well (like we have in grub with models there).

Oct 18 '23 23:10 xnox

@keszybz @bluca share the concern wrt multiple sections with the same name, there has to be something broken out there that keys a hashtable by section names, or something like that. i've never seen a PE with such section naming in the "wild", and there probably is a reason for that.

can we not number them like .dtb$1, .dtb$2 instead? Or if that is not enough dtbs for everybody, maybe come up with a binary format to put multiple dtbs concatenated into one section?

Oct 19 '23 07:10 kukrimate

Nah that is just messy. Our tools can deal with it just fine, if others are broken just fix them. These are new features so it's fine.

Oct 19 '23 08:10 bluca

@diogo-ivo It'd be great if we could merge this for v255-rc1. Please post an update.

Oct 19 '23 15:10 keszybz

@keszybz Since we are interested in this feature too, if the original author no longer wishes to work on it, I think I can find some time next week to give this a facelift.

Oct 19 '23 20:10 kukrimate

Nah that is just messy. Our tools can deal with it just fine, if others are broken just fix them. These are new features so it's fine.

we must ensure that authenticode calculation is stable, in the face of reordering sections. Or for example, we should restrict the spec that all ".dtb" sections must be one after the other. At this point I am not even sure if this matters, or if this is a red herring.

Oct 20 '23 01:10 xnox

Separately thinking about a UKI which has one optional dtb, to be used when booting on real hardware. But not required at all when booting in qemu vm. I hope we have explicit test for this, of not loading dtb at all if it doesn't match at all. I believe currently we always load dtb if present even if it doesn't match the hw. And this is a behaviour change. Unless we simply require for dtbs to match, or like if one wants to force load the dtb, the dtb compatible must have a flag value for sd-boot to recommend always loading it.

Oct 20 '23 01:10 xnox

Nah that is just messy. Our tools can deal with it just fine, if others are broken just fix them. These are new features so it's fine.

we must ensure that authenticode calculation is stable, in the face of reordering sections. Or for example, we should restrict the spec that all ".dtb" sections must be one after the other. At this point I am not even sure if this matters, or if this is a red herring.

I wouldn't think that was any different than reordering any other section? Ukify processes them in a fixed order. As long as the input is the same, then output should be the same too, unless I'm missing something.

Oct 20 '23 09:10 bluca

maybe it makes sense for ukify to write things out in the "canonical order" of sections which we use to measure them, as defined in the UKI spec. And maybe we should order the dtb sections by their binary contents (and maybe even suppress duplicates?)

That way, the order in which things are specified doesn't matter anymore, we'll build the same PE files no matter what it only matters that you have the same inputs, not in which order you give them to ukify.

Oct 20 '23 10:10 poettering

maybe it makes sense for ukify to write things out in the "canonical order" of sections which we use to measure them, as defined in the UKI spec.

This definitely makes sense

And maybe we should order the dtb sections by their binary contents (and maybe even suppress duplicates?)

This seems more difficult, especially to define in a spec for cross-compatibility? Order of input seems easier to me

Oct 20 '23 10:10 bluca

Thank you for the input, this version should address the comments made above except for the systemd-measure/ordering part, which I'll add when we reach an agreement on if/how we should order the sections

Oct 20 '23 11:10 diogo-ivo

../src/boot/efi/devicetree.c:156:25: error: expected expression
                        size_t len = be32toh(cursor[++i]);
                        ^
../src/boot/efi/devicetree.c:158:57: error: use of undeclared identifier 'len'
                        size_t len_words = DIV_ROUND_UP(len, sizeof(uint32_t));
                                                        ^
../src/boot/efi/devicetree.c:158:57: error: use of undeclared identifier 'len'
../src/boot/efi/devicetree.c:164:81: error: use of undeclared identifier 'len'
                                if (len == 0 || i + len_words > size_words || c[len - 1] != '\0')
                                                                                ^
../src/boot/efi/devicetree.c:164:37: error: use of undeclared identifier 'len'
                                if (len == 0 || i + len_words > size_words || c[len - 1] != '\0')
                                    ^

Oct 20 '23 11:10 bluca