ServerAttestor plugin type
While working through some of the issues with fixing rebootstrapping (https://github.com/spiffe/spire/issues/4624), one standalone piece popped out of the discussions.
I would like to be able to have custom ServerAttestor plugins.
There are currently 2 implemented in spire-server itself. Let's call them fetch_from_file and fetch_from_single_url.
I can think of at least three more basic ones, without thinking a lot about it:
- Give it a list of URLs and require a majority to respond with the same bundle, and use the majority opinion. This would allow (re)bootstrapping from various cloud providers without completely trusting any one provider.
- A break-glass one, used as a last resort to recover a large number of agents after a major failure, where it points at an untrusted server (http) and the bundle is signed with a PGP key.
- Call the Kubernetes apiserver with Kubernetes in-cluster credentials to fetch the bundle directly from the apiserver.
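The majority idea from the first item above can be sketched roughly as follows. This is a minimal illustration, not SPIRE code: `majorityBundle` is a hypothetical helper, it assumes bundles are compared byte-for-byte (a real plugin might compare parsed certificate sets instead), and it assumes a source that failed to respond is represented by a nil entry that still counts toward the total.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// majorityBundle returns the bundle that a strict majority of the configured
// sources agree on. Each element of responses holds the raw bundle bytes from
// one URL; a nil entry stands for a source that did not respond.
// Hypothetical helper for illustration only.
func majorityBundle(responses [][]byte) ([]byte, error) {
	counts := make(map[[sha256.Size]byte]int)
	for _, r := range responses {
		if r == nil {
			continue // failed source: counts toward the total, votes for nothing
		}
		sum := sha256.Sum256(r)
		counts[sum]++
		if counts[sum] > len(responses)/2 {
			return r, nil
		}
	}
	return nil, errors.New("no bundle was returned by a majority of sources")
}

func main() {
	responses := [][]byte{
		[]byte("bundle-A"),
		[]byte("bundle-A"),
		[]byte("bundle-B"),
	}
	bundle, err := majorityBundle(responses)
	if err != nil {
		fmt.Println("quorum failed:", err)
		return
	}
	fmt.Printf("agreed bundle: %s\n", bundle)
}
```

Counting against all configured sources, rather than only those that answered, means a single reachable provider can never win on its own.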
Some of these could be started out of tree and then merged in as their use is proven reliable/useful.
But it would also provide a base for more complex plugins, such as policy aware plugins and/or aggregate plugins.
For example, when in a spire-ha-agent setup, the trust bundle is synced between independent spire-servers. If side A is still functioning but side B needs to (re)bootstrap, it could use side A's trust bundle record for side B to (re)bootstrap in a completely trusted way.
But if both side A and side B have lost trust, then a different mechanism would be needed. So some more complex ServerAttestor plugin that can fail over between more preferred and less preferred (re)bootstrapping plugins, using some policy, would be needed.
Keeping this all in a plugin and potentially out of tree while the design is iterated on and proven out would really help progress.
So, all this to say, I think we want a ServerAttestor plugin type.
Prototype proposal here: https://github.com/spiffe/spire-plugin-sdk/pull/58
Rough idea is: provide the trust domain whose bundle should be fetched, along with some metadata about what the bundle will be used for, so that more complicated plugins can make use of it.
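To make the shape of that idea concrete, here is a rough Go sketch. Everything in it is an assumption for illustration: the `FetchRequest` fields, the `FetchBundle` method, and the `staticAttestor` toy implementation are hypothetical names and are not taken from the prototype PR or from the SPIRE plugin SDK.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Use says why the agent wants a bundle. The two values mirror the
// "bootstrap" and "rebootstrap" uses discussed in this issue.
type Use string

const (
	UseBootstrap   Use = "bootstrap"
	UseRebootstrap Use = "rebootstrap"
)

// FetchRequest is a hypothetical request shape: the trust domain plus
// metadata that a policy-aware plugin might want to look at.
type FetchRequest struct {
	TrustDomain string
	Use         Use
	Attempt     int           // how many tries have been made so far
	Elapsed     time.Duration // how long this attempt has been going
}

// ServerAttestor is a hypothetical plugin interface, not the one in the PR.
type ServerAttestor interface {
	FetchBundle(ctx context.Context, req FetchRequest) ([]byte, error)
}

// staticAttestor is a toy implementation backed by an in-memory map.
type staticAttestor struct {
	bundles map[string][]byte
}

func (s staticAttestor) FetchBundle(_ context.Context, req FetchRequest) ([]byte, error) {
	b, ok := s.bundles[req.TrustDomain]
	if !ok {
		return nil, fmt.Errorf("no bundle for trust domain %q (use=%s, attempt=%d)",
			req.TrustDomain, req.Use, req.Attempt)
	}
	return b, nil
}

// fetchFor shows how a caller might drive any ServerAttestor.
func fetchFor(a ServerAttestor, trustDomain string, use Use) ([]byte, error) {
	return a.FetchBundle(context.Background(), FetchRequest{
		TrustDomain: trustDomain,
		Use:         use,
		Attempt:     1,
	})
}

func main() {
	a := staticAttestor{bundles: map[string][]byte{"example.org": []byte("PEM bytes here")}}
	bundle, err := fetchFor(a, "example.org", UseRebootstrap)
	fmt.Println(string(bundle), err)
}
```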
Thinking about how config could look with this... naming is hard. just some rough ideas...
Before:
agent {
    insecure_bootstrap = true
}
Becomes:
plugins {
    ServerAttestor "insecure" {}
}
Before:
agent {
    trust_bundle_path = "./conf/agent/dummy_root_ca.crt"
}
Becomes:
plugins {
    ServerAttestor "path" {
        location = "./conf/agent/dummy_root_ca.crt"
    }
}
Before:
agent {
    trust_bundle_url = "https://localhost:9991"
}
Becomes:
plugins {
    ServerAttestor "url" {
        location = "https://localhost:9991"
    }
}
More options can become available then...
Support more than one source at once (bundles unioned)
plugins {
    ServerAttestor "path" {
        location = "/var/run/spire/agent/a-trustbundle.crt"
    }
    ServerAttestor "path" {
        location = "/var/run/spire/agent/b-trustbundle.crt"
    }
}
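As a rough illustration of what "bundles unioned" could mean for PEM files like the two above, the sketch below concatenates the certificates from several bundles and drops exact duplicates. `unionBundles`, `pemCert`, and `countCerts` are hypothetical helpers, and deduplicating on the raw DER bytes is an assumption about how the union would be defined.

```go
package main

import (
	"encoding/pem"
	"fmt"
)

// unionBundles merges PEM bundles, keeping each CERTIFICATE block only once.
// Blocks are compared by their decoded (DER) bytes. Hypothetical helper.
func unionBundles(bundles ...[]byte) []byte {
	seen := make(map[string]bool)
	var out []byte
	for _, b := range bundles {
		rest := b
		for {
			var block *pem.Block
			block, rest = pem.Decode(rest)
			if block == nil {
				break // no more PEM blocks in this bundle
			}
			if block.Type != "CERTIFICATE" || seen[string(block.Bytes)] {
				continue
			}
			seen[string(block.Bytes)] = true
			out = append(out, pem.EncodeToMemory(block)...)
		}
	}
	return out
}

// pemCert wraps arbitrary bytes in a CERTIFICATE block. The payload is not a
// real certificate; it only stands in for one in this example.
func pemCert(der []byte) []byte {
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
}

// countCerts counts the CERTIFICATE blocks in a PEM bundle.
func countCerts(bundle []byte) int {
	n := 0
	for {
		var block *pem.Block
		block, bundle = pem.Decode(bundle)
		if block == nil {
			return n
		}
		if block.Type == "CERTIFICATE" {
			n++
		}
	}
}

func main() {
	a := pemCert([]byte("cert-one"))
	b := pemCert([]byte("cert-two"))
	merged := unionBundles(a, b, a) // the repeated bundle adds nothing new
	fmt.Println("certificates in union:", countCerts(merged))
}
```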
Use the trust bundle from the other spire-agent in a spire-ha-agent setup, if it exists. Give the sysadmin time to put things in place. If not recovered in an hour, fall back to another method.
plugins {
    ServerAttestor "spire-ha" {
        location = "/var/run/spire/agent/public/api.sock"
    }
    ServerAttestor "url" {
        location = "https://s3...."
        validAfter = "1h"
    }
}
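One way to read the validAfter setting is as a per-source delay measured from the start of the (re)bootstrap attempt. The sketch below encodes that reading; the `source` type and the `eligible` and `exampleSources` functions are hypothetical, and treating a missing validAfter as "usable immediately" is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// source is a hypothetical, pared-down view of one configured plugin block.
type source struct {
	Name       string
	ValidAfter time.Duration // zero means usable immediately
}

// eligible returns the sources that may be tried once elapsed time has passed
// since the attempt began, preserving the configured order.
func eligible(sources []source, elapsed time.Duration) []source {
	var out []source
	for _, s := range sources {
		if elapsed >= s.ValidAfter {
			out = append(out, s)
		}
	}
	return out
}

// exampleSources mirrors the two plugin blocks in the config above.
func exampleSources() []source {
	return []source{
		{Name: "spire-ha"},
		{Name: "url", ValidAfter: time.Hour},
	}
}

func main() {
	for _, elapsed := range []time.Duration{10 * time.Minute, 90 * time.Minute} {
		fmt.Println(elapsed, eligible(exampleSources(), elapsed))
	}
}
```

Under this reading, only the spire-ha source is consulted during the first hour, and the URL source joins the list after that.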
Different sources can have different uses. If no use is specified, it's usable for both ["bootstrap", "rebootstrap"]
plugins {
    ServerAttestor "insecure" {
        use = ["bootstrap"]
    }
    ServerAttestor "url" {
        location = "https://s3...."
        use = ["rebootstrap"]
    }
}
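The defaulting rule for use could be expressed along these lines. This is only a sketch: the `source` type and `usableFor` function are hypothetical names, and the only behaviour taken from the description above is that an empty use list means the source applies to both bootstrap and rebootstrap.

```go
package main

import "fmt"

// source is a hypothetical, pared-down view of one configured plugin block.
type source struct {
	Name string
	Use  []string // empty means usable for every purpose
}

// usableFor reports whether the source may be consulted for the given use
// ("bootstrap" or "rebootstrap").
func usableFor(s source, use string) bool {
	if len(s.Use) == 0 {
		return true
	}
	for _, u := range s.Use {
		if u == use {
			return true
		}
	}
	return false
}

func main() {
	sources := []source{
		{Name: "insecure", Use: []string{"bootstrap"}},
		{Name: "url", Use: []string{"rebootstrap"}},
		{Name: "path"}, // no use given, so valid for both
	}
	for _, s := range sources {
		fmt.Printf("%s: bootstrap=%t rebootstrap=%t\n",
			s.Name, usableFor(s, "bootstrap"), usableFor(s, "rebootstrap"))
	}
}
```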
We discussed this during the maintainers meeting last week and during the contributors sync yesterday. One main takeaway from that is that most of the use cases here can be handled through an external process that maintains the bundle by updating the file on disk for the "path" variant.
We're also not sure if there's going to be a lot of need for customisation here, so we'd generally prefer to handle this outside of SPIRE without the overhead of the plugins (both for users in terms of configuration options but also for us in terms of maintenance).
If there are cases where it's not possible to use the path to a file on disk or where a plugin could simplify things considerably, we could reconsider this. If you know of any such cases, please let us know.
I have several issues with that approach.
The plugin logic could be moved out to an external process like you suggest, but that has some drawbacks:
- It would have a separate lifecycle from the rest of the system. SPIRE plugins fork off with the agent, so they are a bit easier to manage that way.
- We have plugins for everything else. It's odd to use a completely different mechanism for this type of extension.
- Configuration for it would be completely different and not stored in the spire-agent config like everything else.
- Logging would be somewhere else (it can't use spire-agent's logger).
- Sysadmins would have to come up with monitoring/alerting for the extra process.
- Code can't be reused as much. If I wrote the quorum URL functionality for the agent, then in my own policy server I'd have to re-implement it there too. Having an official interface for plugins would allow one plugin to reuse another.
- It doesn't work via a unix socket, so it is harder to relate the two processes together securely.
- At least for using a file as the interface, there is no way to have additional metadata from where in the attestation/reattestation process it is. For example, is it reattesting or attesting, how long has this attempt been going, how many tries has it made so far, etc. For some of the policy based logic I need to implement, that would be a show stopper. :/
I think issue 8 (the missing metadata) could be resolved by adding that data as query parameters or HTTP headers to the URL-based trust bundle fetching code that exists today. Would that be acceptable?
I don't think the URL-based trust bundle fetcher supports fetching from a unix socket, which is also needed for this case. If it is to be used for local logic that determines how to fetch the trust bundle, it can't have its own https certificate, and plain http seems less than ideal to support.
So, I think if we enhanced the URL-based fetcher code to pass metadata and support unix sockets, that would be a path forward that would work, and I would be willing to implement things that way. But I think the user experience is not as good as making it a plugin like the rest of the plugins, so I would prefer to go down that route if the above discussion moves the needle towards making it a proper plugin.
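For reference, fetching over HTTP on a unix socket with metadata in the query string can be sketched in Go as below. This is not the SPIRE implementation: the `/bundle` path, the `use` and `attempt` parameter names, and the `fetchBundle` and `demo` functions are all hypothetical, and the throwaway server in `demo` only stands in for a local policy process.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
)

// fetchBundle GETs a bundle over a unix domain socket, passing the metadata
// as query parameters. Hypothetical helper; parameter names are made up.
func fetchBundle(ctx context.Context, socketPath string, meta map[string]string) ([]byte, error) {
	client := &http.Client{
		Transport: &http.Transport{
			// Ignore the network and address from the URL and dial the socket.
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
	q := url.Values{}
	for k, v := range meta {
		q.Set(k, v)
	}
	// The host is a placeholder; the connection always goes to socketPath.
	u := url.URL{Scheme: "http", Host: "localhost", Path: "/bundle", RawQuery: q.Encode()}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// demo starts a throwaway server on a temporary socket and fetches from it.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "bundle-sock")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "api.sock")
	ln, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// A real policy server would pick a bundle based on this metadata.
		q := r.URL.Query()
		fmt.Fprintf(w, "bundle for use=%s attempt=%s", q.Get("use"), q.Get("attempt"))
	})}
	go srv.Serve(ln)
	defer srv.Close()
	body, err := fetchBundle(context.Background(), sock,
		map[string]string{"use": "rebootstrap", "attempt": "3"})
	return string(body), err
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

Because the socket is a filesystem object, access to the local policy process can be limited with ordinary file permissions instead of a TLS certificate.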
What do you think?
Waiting for #5892 before making a decision.
Closing this for now. The http unix socket approach is working well enough for now. We can revisit this again in the future if so desired.