mkosi icon indicating copy to clipboard operation
mkosi copied to clipboard

Added support for including UUID in artifact names

Open mcassaniti opened this issue 3 years ago • 22 comments

When performing updates using systemd-sysupdate, it is ideal to provide the partition UUIDs so that systemd-gpt-auto-generator can find the partitions based on either roothash= or usrhash= from the kernel command line. See the examples here from the sysupdate.d documentation.

I'm sure there will be some disagreement over the naming of options, so if someone can come up with something better than VerityUUIDNames I'm all for it.

Currently I've got a script doing this work, but that also requires updating the file names in SHA256SUMS and re-signing.

mcassaniti avatar Aug 19 '22 02:08 mcassaniti

What about putting this information in the image manifest file instead?

DaanDeMeyer avatar Aug 19 '22 09:08 DaanDeMeyer

If that will work with systemd-sysupdate then I'm all for it, but I don't believe it will parse the manifest. We could do both.

mcassaniti avatar Aug 19 '22 10:08 mcassaniti

cc @poettering who knows more about sysupdate than me. I don't like adding options left and right, we already have way too many as is, maybe we can do this by default?

DaanDeMeyer avatar Aug 19 '22 10:08 DaanDeMeyer

I haven't looked at this yet, but I'm not a fan of adding an option for this myself, looking at the general idea, though wouldn't it be an option to add some basic specifier support or sysupdate match option support in reverse (i.e. using the match pattern wildcards from man sysupdate.d as specifiers to expand) to the output filenames?

behrmann avatar Aug 19 '22 10:08 behrmann

... wouldn't it be an option to add some basic specifier support or sysupdate match option support in reverse (i.e. using the match pattern wildcards from man sysupdate.d as specifiers to expand) to the output filenames?

Can you provide an example of what this might look like?

mcassaniti avatar Aug 19 '22 10:08 mcassaniti

Going from your example in the docs artifact_93e00bec-0948-4119-8877-c10c0850617d.verity could this maybe be something like [email protected]?

behrmann avatar Aug 19 '22 11:08 behrmann

The idea is that @u in the configuration is a placeholder for the partition UUID that systemd-sysupdate will in turn set as the partition UUID. I don't see how the partition UUID can be gleaned if the file name is literally [email protected]. The only way I can see another solution working is the addition of another file which stores the partition UUID values or adding them in the manifest, both of which would require extending systemd-sysupdate.

I'm sure it would be nicer for mkosi if systemd-sysupdate supported something else so we didn't have to mangle file names. I'm guessing the systemd folks would rather not add another feature to systemd-sysupdate that does the same job as an existing feature.

Now I could be completely misunderstanding your point. I'm human so it's likely.

mcassaniti avatar Aug 19 '22 11:08 mcassaniti

Maybe I'm misunderstanding what you're doing, as I said, I haven't yet followed the code, just skimmed quickly, but you know the the UUID you want to write into the file (the argument to substitute_uuid), so if the filename would be [email protected] you would replace @u for the UUID passed as an argument.

This would of course need a workaround for people who actually want to put in a literal @u into the name, maybe @@ as syntax for a single @ or maybe mixing this with specifier syntax into %@u for a real Perl-like "I've just been hitting the keyboarv" feeling.

Then again, I might be totally misunderstanding what you want to do. :)

behrmann avatar Aug 19 '22 13:08 behrmann

Now I get it, we could create the output file names based on a template. If we follow the convention from sysupdate then the default template would be something like %i_%v_%u.verity where %i is the image argument and would default to 'image'.

While this sounds like a plan, I can't see a reasonable way to support the current substitutions. Take a look at how many different ways the output file name for an EFI kernel can be generated.

After looking harder, we'd only need the following substitutions to create file name templates with the default template in most cases being %i_%v.%s.%c.

Character Detail
%i Image name. If not defined this defaults to 'image'.
%v Image version. If not specified then substitute an empty string.
%s Default file suffix. This would be raw, qcow2 or verity for example.
%c Compression suffix. If no compression is used then substitute an empty string.
%u Partition UUID. If there is no partition UUID then substitute an empty string.

If this was implemented I'd suggest a regular expression replacement that also conditionally consumed the previous character. It would be something like:

# Extract the previous character from the pattern if any
prev_char = re.search('([-|_|\.])%v', filename)
prev_char = prev_char.groups()[0] if prev_char else ''

# Substitute the version
sub_str = f'{prev_char}{args.version}' if args.version else ''
filename = re.sub('[-|_|\.]%v', sub_str, filename)

Thoughts?

mcassaniti avatar Aug 19 '22 23:08 mcassaniti

I've updated this in preparation for MkosiConfig being a frozen dataclass so that it isn't modified during execution.

mcassaniti avatar Sep 06 '22 07:09 mcassaniti

@poettering I think everyone is waiting on you to way in here. In short there's a need to provide systemd-sysupdate with the UUIDs of partitions when using verity. I'm suggesting optionally putting this in the file names. The other option so far is putting it in the manifest file, but systemd-sysupdate doesn't know how to parse that yet.

mcassaniti avatar Sep 07 '22 08:09 mcassaniti

So the idea of sysupdate was that the file names carry the little metadata we really need, i.e. name, version info and so on. And that includes the partition uuid if files are supposed to be written to a partition in the end.

So yes, it would be excellent if mkosi would take a pattern for naming the output image files, and then automatically fill relevant info in there.

i.e. this is the relevant table:

https://www.freedesktop.org/software/systemd/man/sysupdate.d.html#Match%20Patterns

No all those fields really have anything useful for mkosi to fill data into, but many do, and the uuid is one. Ideally, mkosi would simply use the same way to denote this fields, to minimize confusion.

systemd-sysupdate uses classic UNIX SHA256SUMS files as manifest, i.e. it really only has the filename and a sha sum as metadata for files.

Now, I figure the generate mkosi native manifest files should also carry this info, but I really like the simplicity of just using SHA256SUMS for everything in sysupdate. After all this means there's a very large set of tools already to generate these manifests and use them, and it also pushes things towards a certain level of data frugality: only put the absolute minimum in the file name. the stuff you really need. This is good, because typically contents is better protected than metadata when it comes to cryptographic authentication. i.e. we should minimize the stuff we put in unauthenticated places, because each time when we put stuff there we need to consider if changing this will lead to more than a DoS vulnerability.

Hope this makes sense?

poettering avatar Sep 09 '22 13:09 poettering

And yeah I guess a typical sysupdated based systems would name the files with @v-@u in the file name, the other fields are not relevant or can be set in the sysupdate.d/ drop-in globally.

poettering avatar Sep 09 '22 13:09 poettering

@poettering To me it makes perfect sense and I agree with the security benefits of having a the file name protected. I'm hoping @DaanDeMeyer can way in at this point.

I don't foresee any other details being substituted into the file names in the long run. We could either substitute the UUID information by default when verity is enabled or put it behind an option which is what this PR does.

mcassaniti avatar Sep 09 '22 13:09 mcassaniti

I don't foresee any other details being substituted into the file names in the long run. We could either substitute the UUID information by default when verity is enabled or put it behind an option which is what this PR does.

Hmm, actually including the incompressed file size in the name makes sense too, i.e. @s. This allows systemd-sysupdate to search for a free partition more correctly, since it then knows how much space it will need even if it downloads a compressed file.

so, <name>-@v-@u-@s might actually be what you want for sysupdate.

poettering avatar Sep 09 '22 13:09 poettering

Very insightful and exactly the sort of feedback we need.

mcassaniti avatar Sep 09 '22 13:09 mcassaniti

So, I'd probably go for a solution that allows a pattern to be defined. The thing is you want different naming depending on how you want to use the image. if you intend to drop it in a partition table, you want the uuid and uncompressed size. if you just want to drop it in a dir in the fs, you dont need the uuid, and not really the uncompressed size either but maybe other stuff. So, it's probably a place where you want some flexibility.

poettering avatar Sep 09 '22 15:09 poettering

Regarding name templates:

If this was implemented I'd suggest a regular expression replacement that also conditionally consumed the previous character.

Replacement with regular expressions is quite finicky, I suggest to use string.Template:

import re, string

class NameTemplate(string.Template):
    delimiter = "@"
    idpattern = "(?a:[a-z])" # single ASCII letter
    flags = re.IGNORECASE

part_uuid = "c85858fd-3b1e-42ff-95e6-cb0a26de5d98"
part_size = 2 ** 23

name = NameTemplate("goodlinux_@v_@[email protected]").substitute(v=42, u=part_uuid, s=part_size, r=0)
print(name)

The resulting name is goodlinux_42_c85858fd-3b1e-42ff-95e6-cb0a26de5d98_8388608.usr. Name templates can be configured using --output-split-root and friends (after #1177) is fixed. To implement the name matching I would add the root_hash to MkosiState. With the root hash and the name pattern I can implement getters (or @propertys) for output_split_root etc. in MkosiState. This ensures that only paths where all patterns are correctly substituted are created.

pyfisch avatar Sep 09 '22 15:09 pyfisch

My intention with the regular expression replacement was entirely around consuming the previous character if the value to substitute was not set. This would allow mkosi to ship with default name templates. See below for some examples. With string templates the values can easily be substituted but will leave separator characters in the output string if they are not defined.

If default templates are not important or it doesn't matter if those separator characters remain then even a series of calls to replace() will work just fine.

Pattern Image ID Image Version Suffix UUID File Size Output String
%i_%v.%z image raw image.raw
%i_%v.%z image 001 raw image_001.raw
%i_%v-%u.%z image 001 usr.xz image_001.usr.xz
%i_%v-%u.%z image 001 usr.xz 36ebd1cb-a80c-4a5a-b86a-d30ded6769d5 image_001-36ebd1cb-a80c-4a5a-b86a-d30ded6769d5.usr.xz
%i_%v-%u.%s.%z image 001 usr.xz 36ebd1cb-a80c-4a5a-b86a-d30ded6769d5 image_001-36ebd1cb-a80c-4a5a-b86a-d30ded6769d5.usr.xz
%i_%v-%u.%s.%z image 001 usr.xz 36ebd1cb-a80c-4a5a-b86a-d30ded6769d5 163549802 image_001-36ebd1cb-a80c-4a5a-b86a-d30ded6769d5.163549802.usr.xz

I noticed #1181 by @pyfisch so it's nice to see someone else trying to tackle these issues too.

mcassaniti avatar Sep 10 '22 06:09 mcassaniti

My intention with the regular expression replacement was entirely around consuming the previous character if the value to substitute was not set. This would allow mkosi to ship with default name templates. See below for some examples.

Thanks for explaining it to me. I hadn't thought about default name templates and removing previous characters if a value isn't set. In my design I want to fail if a value isn't set because it catches bugs where some code thinks a value is set but actually isn't. In addition it mitigates bugs where different parts of the code use different paths for the same artifact.

pyfisch avatar Sep 10 '22 12:09 pyfisch

My intention with the regular expression replacement was entirely around consuming the previous character if the value to substitute was not set. This would allow mkosi to ship with default name templates. See below for some examples.

Thanks for explaining it to me. I hadn't thought about default name templates and removing previous characters if a value isn't set. In my design I want to fail if a value isn't set because it catches bugs where some code thinks a value is set but actually isn't. In addition it mitigates bugs where different parts of the code use different paths for the same artifact.

Like all things in life there's always more than one solution. What makes it easier for a user and makes the most sense to support long term? There's always unit tests to (hopefully) catch bugs.

mcassaniti avatar Sep 10 '22 13:09 mcassaniti

Like all things in life there's always more than one solution. What makes it easier for a user and makes the most sense to support long term? There's always unit tests to (hopefully) catch bugs.

Personally I find software easier that raises an error if I made a mistake and does not try to fix it itself. For example if I specify FooOS_%v_%u.kernel and %u isn't defined I prefer an error message over an output file FooOS_42.kernel and debugging my script for an hour to find out that I really can't specify %u for a kernel.

With #1181 mkosi could also ship default name templates, load_args just needs to choose different templates based on the other parameters. Would you mind trying out my PR?

pyfisch avatar Sep 11 '22 08:09 pyfisch

Since we'll support this via systemd-repart as soon as that's implemented, let's close this PR

DaanDeMeyer avatar Oct 27 '22 08:10 DaanDeMeyer