x509pop server plugin support for servers trust bundle

Open kfox1111 opened this issue 1 year ago • 9 comments

Enables the x509pop node attestor server plugin to be configured to use the SPIRE Server's own trust bundle.

kfox1111 avatar Oct 13 '24 01:10 kfox1111

Using the bundle, without any sort of restriction, seems scary. Doesn't this imply that any workload can just turn around and attest as a node?

Yeah, that I think can be handled like:

    NodeAttestor "x509pop" {
      plugin_data {
        spire_trust_bundle = true
        # Only allow spiffe identifiers of the form spiffe://<trustDomain>/k8s-spire-agent-helper/<hostname>
        # Generate identifiers that look like spiffe://<trustDomain>/x509pop/k8s-spire-agent/<hostname>
        agent_path_template = "{{ ${d}p := printf \"spiffe://%s/k8s-spire-agent-helper/\" .TrustDomain }}{{ ${d}s := printf \"%s\" (index .Certificate.URIs 0) }}{{ if gt (len ${d}s) (len ${d}p) }}{{ ${d}ps := slice ${d}s 0 (len ${d}p) }}{{ if eq ${d}ps ${d}p }}{{ printf \"/x509pop/k8s-spire-agent/%s\" (slice ${d}s (len ${d}p)) }}{{ end }}{{ end }}"
      }
    }

If the agent path renders out to "", then attestation should be disallowed.
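To make the intent of that template easier to follow, here is a rough Python sketch of the same logic. It is not SPIRE code: `render_agent_path` and `attest` are hypothetical names, and the second function only illustrates the "empty render means reject" rule proposed above.

```python
def render_agent_path(trust_domain: str, uri_san: str) -> str:
    """Map a helper SVID URI to an agent path, or "" if it does not match."""
    prefix = f"spiffe://{trust_domain}/k8s-spire-agent-helper/"
    # Same checks as the template: the URI must be strictly longer than the
    # prefix (so a hostname is present) and must start with the prefix.
    if len(uri_san) > len(prefix) and uri_san[:len(prefix)] == prefix:
        return "/x509pop/k8s-spire-agent/" + uri_san[len(prefix):]
    return ""


def attest(trust_domain: str, uri_san: str) -> str:
    """Hypothetical guard: refuse attestation when the path renders empty."""
    path = render_agent_path(trust_domain, uri_san)
    if not path:
        raise PermissionError(f"{uri_san!r} is not allowed to attest as a node")
    return path
```

With this rule, an ordinary workload SVID such as `spiffe://example.org/workload/foo` renders to "" and is refused, while `spiffe://example.org/k8s-spire-agent-helper/node1` maps to `/x509pop/k8s-spire-agent/node1`.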

I did hit:

  • https://github.com/spiffe/spire/issues/5574
  • https://github.com/spiffe/spire/issues/5573

Trying to get it to work though.

kfox1111 avatar Oct 14 '24 18:10 kfox1111

Hey @kfox1111, do you have an arch diagram or something that spells out the use case clearly that you can share (either here or privately in slack with the maintainer group)? Before we consider taking this we'd like to see if there are alternatives.

azdagron avatar Oct 17 '24 18:10 azdagron

Here's the general idea. Got the host -> k8s part working. Still working on the k8s -> vm part.

With this configuration, the spire-agent on the host managed by systemd is the first to start, and establishes the trust chain used by all other parts of the node. The trust never leaves the node. The SPIRE server then really is the bottom turtle in this setup. 🐢

x509pop

kfox1111 avatar Oct 18 '24 14:10 kfox1111

Related to: https://github.com/spiffe/spire/issues/5206

kfox1111 avatar Oct 18 '24 14:10 kfox1111

Used with the sprig PR, this works:

    agent_path_template = "{{ $$p := printf \"spiffe://%s/k8s-spire-agent-helper/\" .TrustDomain }}{{ $$s := printf \"%s\" (index .Certificate.URIs 0) }}{{ if hasPrefix $$p $$s }}{{ printf \"/x509pop/k8s-spire-agent/%s\" (trimPrefix $$p $$s) }}{{ else }}{{ fail \"Invalid prefix\" }}{{ end }}"

kfox1111 avatar Oct 19 '24 11:10 kfox1111

Updated diagram

kfox1111 avatar Oct 19 '24 15:10 kfox1111

I've had some thoughts about a similar (or maybe the same, I have yet to think it through fully) architecture, chaining trust all the way to the hardware (which would likely be TPMs too). Just a +1 that this kind of thing is something that other users might want to do.

My thinking here is that for something like this we might want a separate node attestor (maybe x509svidpop?) because the differences between x509 and X509-SVID matter here. For example, you might want to limit which SPIFFE IDs you allow in some way (e.g. a list of SPIFFE IDs or at least some kind of validation of the SPIFFE ID format), as well as extract some information from the SPIFFE ID to make available to the template.

Another thing that would be useful to me would be to allow having different trust domains, e.g. the physical nodes using TPMs could be a different trust domain than the ones on k8s. In large corporations it's possible that different groups manage those different infrastructures so they may be different trust domains.
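As a rough illustration of what that kind of validation could look like (a sketch only, not a proposed plugin API), the Python below checks a SPIFFE ID against a per-trust-domain allow-list and extracts named fields that a template could then use. The `ALLOWED` table, the trust domain `hosts.example.org`, and `match_spiffe_id` are all made up for the example.

```python
import re
from urllib.parse import urlsplit

# Hypothetical policy: trust domain -> regex the SPIFFE ID path must fully match.
ALLOWED = {
    "hosts.example.org": re.compile(
        r"/k8s-spire-agent-helper/(?P<hostname>[a-z0-9.-]+)"
    ),
}


def match_spiffe_id(spiffe_id: str) -> dict[str, str]:
    """Validate a SPIFFE ID and return fields extracted from it."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc or parts.query or parts.fragment:
        raise ValueError(f"not a valid SPIFFE ID: {spiffe_id!r}")
    pattern = ALLOWED.get(parts.netloc)
    if pattern is None:
        raise PermissionError(f"trust domain {parts.netloc!r} is not allowed")
    match = pattern.fullmatch(parts.path)
    if match is None:
        raise PermissionError(f"path {parts.path!r} is not allowed")
    # Named groups become the data handed to the agent path template.
    return {"trust_domain": parts.netloc, **match.groupdict()}
```

Keying the policy by trust domain is one way to cover the case where the attesting SVIDs come from a different trust domain than the server's own.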

sorindumitru avatar Oct 23 '24 10:10 sorindumitru

@sorindumitru I think we're pretty much on the same page.

We could make a new x509svidpop plugin, but it would end up being almost the same code as the x509pop one (the patch here is pretty small), so there would be a lot more stuff to maintain.

The SVID use case really isn't that different from the normal x509pop one. Either way, you probably want some way to restrict the allowed certs to a subset of all possible ones. That can be done via agent_path_template, as done in this PR, without any special code in the plugin.

100% agree on the future ability to use some other spire trustDomain too. Was going to propose that in a different patch. The primary use case for me being:

A multitenant k8s cluster, secured with SPIRE. Call this the resource provider cluster.

A tenant uses the resource provider cluster with KubeVirt to launch a bunch of VMs. Inside, they launch their own SPIRE server and workloads (maybe even their own k8s cluster in the VMs). It would greatly simplify management if they could use x509 certs from the resource provider cluster to node attest to their own spire-server.

That, again, would basically be the same x509pop plugin as described here, but getting its trust bundle from disk and keeping it refreshed (existing functionality extended to refresh from disk), or getting it over vsock (not sure which is best yet).
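For the refresh-from-disk variant, the idea can be sketched roughly as follows. This is a Python illustration under assumptions, not how the plugin loads bundles today: `RefreshingBundle` is a hypothetical name, and the change check (modification time plus size) is just one simple option.

```python
import os


class RefreshingBundle:
    """Caches PEM data from a file and rereads it when the file changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._stamp = None  # (mtime_ns, size) of the last read
        self._pem = b""

    def get(self) -> bytes:
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            # File was replaced or rewritten since the last read: reload it.
            with open(self.path, "rb") as f:
                self._pem = f.read()
            self._stamp = stamp
        return self._pem
```

Calling `get()` on each attestation attempt would pick up a rotated bundle without restarting the server, at the cost of one `stat` per call.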

kfox1111 avatar Oct 23 '24 15:10 kfox1111

I agree it's going to be very similar to x509pop, but dealing with that is an implementation detail, and I'm sure some parts can be shared. I just don't think that the template language is a good way of dealing with those differences.

sorindumitru avatar Oct 24 '24 05:10 sorindumitru