Embedding maintenance mode config via data partition in .iso

Open stereobutter opened this issue 1 month ago • 11 comments

Feature Request

Have an option in imager to attach configuration and extra data for maintenance mode via an additional filesystem added as partition in the .iso (instead of embedding it in the executable) and have talos maintenance mode discover and mount the filesystem. It would be great if that was a supported way to embed initial configuration.

Description

We are currently using imager to build an .iso per device for bootstrapping that contains device specific configuration (e.g. network settings, talos.config.oauth.client_id etc.). The issue with that is that these device-specific settings are part of the measured boot chain resulting in different PCRs (4, 11 and 12) for each .iso. This makes managing and tracking known good PCRs really hard and also makes it impossible to lock out a specific version of our .iso via the secure boot dbx if that ever became necessary.
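To make the PCR problem concrete, here is a minimal Go sketch of the TPM extend rule for a SHA-256 bank. It is a simplified model, not how Talos or the firmware computes PCRs 4, 11 and 12: real values are built from the boot event log, and `measureImage` and the sample byte strings are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// extendPCR models the TPM extend operation for a SHA-256 bank:
// new = SHA256(old || SHA256(measured data)).
func extendPCR(pcr [sha256.Size]byte, data []byte) [sha256.Size]byte {
	digest := sha256.Sum256(data)
	return sha256.Sum256(append(pcr[:], digest[:]...))
}

// measureImage replays a list of measured components into a zeroed PCR
// and returns the final value as hex. (Hypothetical helper for this sketch.)
func measureImage(components ...[]byte) string {
	var pcr [sha256.Size]byte
	for _, c := range components {
		pcr = extendPCR(pcr, c)
	}
	return hex.EncodeToString(pcr[:])
}

func main() {
	kernel := []byte("identical kernel and initramfs")

	// Same boot components, different embedded device config.
	a := measureImage(kernel, []byte("hostname: node-a"))
	b := measureImage(kernel, []byte("hostname: node-b"))

	fmt.Println("device A:", a)
	fmt.Println("device B:", b)
	fmt.Println("A == B:", a == b)
}
```

Any byte of device-specific data that is measured ends up in the final value, which is why each per-device .iso needs its own set of known good PCRs.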

From what I read, it's possible to do this directly with xorriso. I had success adding an extra partition to an existing .iso simply by padding it, appending an .img containing a filesystem (I used FAT32), and modifying the partition table accordingly. The resulting .iso was still bootable and the OS was able to find and mount the partition. I used Ubuntu for my tests since it's easier to debug that way than with Talos running in maintenance mode.

Example

Imagine this profile.yaml

arch: amd64
platform: metal
output:
  kind: secureboot-iso
...
extraData:  # new feature to create an extra file system and append that to the .iso/disk image as partition
  content:
    - source: /early-machineconfig  # initial machine config docs placed into the extra file system
      target: /early-machineconfig
    - source: /foo  # directory that includes extra data
      target: /foo  # where to place the dir in the extra filesystem
  signer:  # (optional) key used to sign the file system contents
    awsKMSKeyID: ...
    awsRegion: ...

imager would then:

  • create a signature of the content using the signer config and add the public key and signature to the image for verification at runtime (e.g. via a virtual system extension)
  • create a filesystem from the paths listed in the content section and append that to the iso

talos in maintenance mode would then:

  • discover the extra partition (e.g. via a partition label) and mount the filesystem
  • if the content was signed use the public key and signature to verify the contents
  • apply initial machine documents from some well known location in the filesystem
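A rough Go sketch of the sign-then-verify flow described in the two lists above, assuming Ed25519 and a simple manifest digest over the files. None of this is existing imager or Talos code: `contentDigest`, `signContent`, `verifyContent` and `demoKey` are hypothetical names, the digest format is an arbitrary choice, and a local key stands in for the KMS-backed signer.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// contentDigest hashes a set of files deterministically: paths are sorted and
// every path and body is length-prefixed so entries cannot run into each other.
func contentDigest(files map[string][]byte) []byte {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	h := sha256.New()
	var n [8]byte
	for _, p := range paths {
		binary.BigEndian.PutUint64(n[:], uint64(len(p)))
		h.Write(n[:])
		h.Write([]byte(p))
		binary.BigEndian.PutUint64(n[:], uint64(len(files[p])))
		h.Write(n[:])
		h.Write(files[p])
	}
	return h.Sum(nil)
}

// signContent is the image-builder side: sign the digest of the extra data.
func signContent(priv ed25519.PrivateKey, files map[string][]byte) []byte {
	return ed25519.Sign(priv, contentDigest(files))
}

// verifyContent is the boot side: recompute the digest and check the signature
// before anything from the partition is applied.
func verifyContent(pub ed25519.PublicKey, files map[string][]byte, sig []byte) bool {
	return ed25519.Verify(pub, contentDigest(files), sig)
}

// demoKey derives a fixed key pair from an all-zero seed. For the example only.
func demoKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	priv := ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
	return priv.Public().(ed25519.PublicKey), priv
}

func main() {
	pub, priv := demoKey()
	files := map[string][]byte{
		"/early-machineconfig/network.yaml": []byte("placeholder network document\n"),
		"/foo/data.bin":                     []byte("placeholder extra data"),
	}
	sig := signContent(priv, files)

	fmt.Println("untouched content verifies:", verifyContent(pub, files, sig))

	files["/foo/data.bin"] = []byte("modified after signing")
	fmt.Println("modified content verifies:", verifyContent(pub, files, sig))
}
```

Signing a digest of the whole tree rather than each file separately also catches added, removed or renamed files, since the paths are part of the hashed input.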

stereobutter avatar Dec 10 '25 09:12 stereobutter

This goes against the security - in my opinion embedded machine configuration should be measured as well. Also embedding into the UKI means that it works for PXE booting, covering all boot options.

Talos already supports talos.config=metal-iso (https://docs.siderolabs.com/talos/v1.11/reference/kernel#talos-config) which does what you're asking for.

smira avatar Dec 10 '25 09:12 smira

I forgot about talos.config=metal-iso. Will that work for a partial machine config e.g. just network settings via the new multi doc network resources as well?

stereobutter avatar Dec 10 '25 12:12 stereobutter

> I forgot about talos.config=metal-iso. Will that work for a partial machine config e.g. just network settings via the new multi doc network resources as well?

Yes, it doesn't matter what the config contents are.

It will be "platform" in this scheme: https://docs.siderolabs.com/talos/v1.12/configure-your-talos-cluster/system-configuration/acquire

smira avatar Dec 10 '25 12:12 smira

@smira would there be any chance to add a feature that would enable the user to specify a public key in the kernel arguments, and that would use that public key to verify the config against a signature placed in the same partition before applying it?

mottetm avatar Dec 11 '25 11:12 mottetm

> @smira would there be any chance to add a feature that would enable the user to specify a public key in the kernel arguments, and that would use that public key to verify the config against a signature placed in the same partition before applying it?

I think it's the same as using SecureBoot and embedding the machine config the "proper" way in Talos v1.12?

smira avatar Dec 11 '25 12:12 smira

Not quite. The difference is that embedding the config the "proper" way modifies the content of the PCRs, which means that we would have to recompute them for every ISO that we distribute if we want to be able to establish trust using the TPM. With what we propose, on the other hand, the PCRs would remain identical and we could still verify the integrity of the embedded configuration.

mottetm avatar Dec 11 '25 12:12 mottetm

I don't like the idea of providing yet another public key just for config verification.

There might be some other way, like signing on the valid PCR values during image creation, or re-using SecureBoot key from the kernel ring to verify machine config.

smira avatar Dec 11 '25 12:12 smira

Maybe a bit of background.

Our idea was to request our OEM to generate key-pairs bound to a specific PCR policy matching our generic ISO and to share the public key with us. This allows us to establish trust with the device during commissioning, since it will only have access to the private key if the content of the PCRs matches. But for this to work we need to be able to predict the PCRs and cannot modify the secureboot partition, the kernel arguments or anything like this to include configuration (network for example). At the same time, we would rather have a way to ensure that the config contained in the ISO is the one we have generated and not one that was tampered with.

mottetm avatar Dec 11 '25 12:12 mottetm

> I don't like the idea of providing yet another public key just for config verification.
>
> There might be some other way, like signing on the valid PCR values during image creation, or re-using SecureBoot key from the kernel ring to verify machine config.

The idea would be that the user could supply any public key via e.g. talos.metal-iso.pubkey. It's probably best practice to use a unique signing key for each use case (instead of reusing the signing key for the UKI or the .pcrsig) but a user could if they wanted to. It depends on whether one wants to separate the ability to create an OS release from the ability to sign a machine config (for maintenance mode) for use with a release.

stereobutter avatar Dec 12 '25 08:12 stereobutter

Building on @stereobutter's point about scope, reusing the SecureBoot key also presents implementation challenges.

Looking at the current readConfigFromISO, adding verification via a kernel-arg-provided public key would be straightforward using standard Go crypto. In contrast, extracting the SecureBoot certificate from the kernel's platform keyring requires keyring syscalls, key enumeration, X.509 parsing, and handling for different UEFI implementations. That is a lot more complexity, which could delay the feature if not block it entirely.
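As a rough idea of the size of the kernel-arg variant, here is a standalone Go sketch using only the standard library. It assumes an Ed25519 key passed base64-encoded in the `talos.metal-iso.pubkey` argument floated earlier in this thread (not an existing Talos argument) and a detached signature stored next to the config. It does not touch readConfigFromISO, and `pubKeyFromCmdline`, `verifyConfig` and `demoInputs` are hypothetical names.

```go
package main

import (
	"crypto/ed25519"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// pubKeyArg is the argument name suggested in this thread; it is an assumption.
const pubKeyArg = "talos.metal-iso.pubkey"

// pubKeyFromCmdline finds pubKeyArg in a kernel command line and decodes
// its base64 value into an Ed25519 public key.
func pubKeyFromCmdline(cmdline string) (ed25519.PublicKey, error) {
	for _, field := range strings.Fields(cmdline) {
		if !strings.HasPrefix(field, pubKeyArg+"=") {
			continue
		}
		raw, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(field, pubKeyArg+"="))
		if err != nil {
			return nil, fmt.Errorf("decoding %s: %w", pubKeyArg, err)
		}
		if len(raw) != ed25519.PublicKeySize {
			return nil, fmt.Errorf("%s: got %d bytes, want %d", pubKeyArg, len(raw), ed25519.PublicKeySize)
		}
		return ed25519.PublicKey(raw), nil
	}
	return nil, errors.New(pubKeyArg + " not set")
}

// verifyConfig checks a detached signature over the raw config bytes.
func verifyConfig(cmdline string, config, sig []byte) error {
	pub, err := pubKeyFromCmdline(cmdline)
	if err != nil {
		return err
	}
	if !ed25519.Verify(pub, config, sig) {
		return errors.New("machine config signature mismatch")
	}
	return nil
}

// demoInputs builds a command line, config and signature from a fixed
// all-zero seed. For the example only.
func demoInputs() (cmdline string, config, sig []byte) {
	priv := ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
	pub := priv.Public().(ed25519.PublicKey)
	config = []byte("machine:\n  type: worker\n")
	cmdline = "talos.platform=metal talos.config=metal-iso " +
		pubKeyArg + "=" + base64.StdEncoding.EncodeToString(pub)
	return cmdline, config, ed25519.Sign(priv, config)
}

func main() {
	cmdline, config, sig := demoInputs()
	fmt.Println("original config:", verifyConfig(cmdline, config, sig))
	fmt.Println("tampered config:", verifyConfig(cmdline, append(config, '#'), sig))
}
```

Because the key sits in the kernel arguments, it is measured once for the generic image, while the config and signature on the extra partition can differ per device without changing the PCRs.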

We'd be happy to contribute a PR implementing this approach if it can help move things along.

Could you also elaborate on what you had in mind with "signing on valid PCR values during image creation"? We're open to exploring alternatives.

mottetm avatar Dec 12 '25 08:12 mottetm

It just feels that this feature request is more of "I want to solve my own problem". PR is fine, but maintaining and testing this feature would be on us forever after the point it was introduced. So the cost of adding a nice feature is very high.

There might be other ways to achieve a similar result, e.g. embed a signature in a comment as the first line of the machine configuration, and use that in the remote attestation. Talos preserves the machine configuration as it was submitted, so it's easy to read it back.
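A minimal Go sketch of that first-line-comment idea, assuming Ed25519 and that the attesting side reads the configuration back byte for byte. The `# signature: ` marker and the function names are invented for the example and are not a Talos convention.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"encoding/base64"
	"errors"
	"fmt"
)

// sigPrefix is a hypothetical marker for the signature comment.
const sigPrefix = "# signature: "

// signDocument prepends a YAML comment carrying a signature over body.
func signDocument(priv ed25519.PrivateKey, body []byte) []byte {
	sig := base64.StdEncoding.EncodeToString(ed25519.Sign(priv, body))
	return append([]byte(sigPrefix+sig+"\n"), body...)
}

// verifyDocument splits off the first line and checks its signature
// against everything that follows it.
func verifyDocument(pub ed25519.PublicKey, doc []byte) error {
	nl := bytes.IndexByte(doc, '\n')
	if nl < 0 || !bytes.HasPrefix(doc, []byte(sigPrefix)) {
		return errors.New("no signature comment on first line")
	}
	encoded, body := doc[len(sigPrefix):nl], doc[nl+1:]

	sig, err := base64.StdEncoding.DecodeString(string(encoded))
	if err != nil {
		return fmt.Errorf("decoding signature: %w", err)
	}
	if !ed25519.Verify(pub, body, sig) {
		return errors.New("signature does not match config body")
	}
	return nil
}

// demoKey derives a fixed key pair from an all-zero seed. For the example only.
func demoKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	priv := ed25519.NewKeyFromSeed(make([]byte, ed25519.SeedSize))
	return priv.Public().(ed25519.PublicKey), priv
}

func main() {
	pub, priv := demoKey()
	doc := signDocument(priv, []byte("machine:\n  type: worker\n"))

	fmt.Printf("%s", doc)
	fmt.Println("as submitted:", verifyDocument(pub, doc))
	fmt.Println("with extra byte:", verifyDocument(pub, append(doc, '#')))
}
```

Since the signature lives in a comment, the document stays valid YAML and the check can run entirely on the attestation side.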

smira avatar Dec 12 '25 10:12 smira