[FEATURE] spire server endpoint listing recommended spire-agent version
Since spire-agent requires compatibility within a specific dot version of spire-server, it would be nice if spire-server exposed an unauthenticated API endpoint that returns the highest recommended client version.
This endpoint must be unauthenticated because a compatible agent may not yet be installed.
This would enable automated spire-agent updates by:
- Querying the spire-server for the recommended agent version
- Installing or upgrading to that specific version via local tooling
For containerized deployments, the version string returned by the API should match the corresponding container image tag to ensure seamless automation.
In this way I could essentially do a trivial variable substitution to make sure my spire-agent is on a version known to work with the spire-server I've got deployed.
Some of my edge nodes tend to be offline for highly extended periods and may find themselves left with an unusable agent.
To be clear, this is not a request for the spire server to provide the actual client binaries.
Thanks for opening this issue @jcpunk. We discussed this a bit and here are some notes:
- Based on our upgrade docs, the highest recommended agent version for a given spire-server is that spire-server's own version.
- This can get complicated in HA setups, where you will have multiple spire-server instances that may be running different versions during upgrades.
Based on this we'd propose to have an unauthenticated API that exposes the version of spire-server that you are connecting to. Dealing with HA setups is up to the caller, for example by resolving the spire-server endpoint and calling each individual instance and figuring out the minimum version.
Would that work for you?
I'm concerned that needing to check the upgrade docs would insert a human into what I was hopeful to automate.
In my ideal world there is some sort of automated way for my various edge systems to notice they are running the wrong version of the agent and correct themselves. Some edge nodes are managed by other teams, and getting tooling to automate agent management would make server updates less painful on the clients, allowing them to just consume spire without worrying about version skew.
I suppose a high level of my deployment plans might assist:
- drop a container onto the host which runs the spire-agent
- a nightly job runs to see what version of the agent each node should run based on the spire server(s)
- if the version is different from what is running on the system, change the container and restart it
For the HA world, I'm sure I can build tooling to query my various servers, but without a machine-parseable version of compatible agents, human interaction is a weak point.
The spire-server version would give you the most recent agent version you can upgrade to. Your nightly job would basically check if the running agent is on that version and if not, will do the upgrade. If there are multiple spire-server instances it would use the smallest version of the reported ones for that check.
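The nightly check described above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the version strings are assumed to be plain `major.minor.patch` values that the caller has already collected from each spire-server instance, and how they are collected is left out because the endpoint does not exist yet.

```python
def parse_version(version):
    """Turn 'major.minor.patch' (optionally prefixed with 'v') into an int tuple."""
    text = version.strip()
    if text.startswith("v"):
        text = text[1:]
    # Drop any pre-release or build suffix; this sketch only compares the core version.
    core = text.split("-", 1)[0].split("+", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def target_agent_version(server_versions):
    """Return the smallest version reported by the spire-server instances."""
    if not server_versions:
        raise ValueError("no spire-server versions reported")
    # Compare numerically, not lexically, so that 1.9.0 sorts below 1.12.0.
    return min(server_versions, key=parse_version)


def agent_needs_change(running_agent, server_versions):
    """True if the running agent is not on the target version."""
    target = target_agent_version(server_versions)
    return parse_version(running_agent) != parse_version(target)
```

For example, with servers reporting `1.13.0` and `1.12.6` mid-upgrade, the target is `1.12.6`, and an agent on `1.12.4` would be flagged for a change.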
I don't think we'd have a good way of reporting all compatible versions. At the time a spire-server version is released, some of the agent versions it will be compatible with might not exist yet. For example, yesterday we released 1.12.6, which is compatible with 1.13.0, which was released ~2 months ago. 1.13.0 has no way of knowing that 1.12.6 exists, but they can work together.
You can also consider building a small server exposing this information. It has the benefit that you have more control over the upgrade policies. e.g. you might determine based on the calling ip address that only 10% of the agents should upgrade for the first week of a release.
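One way such a policy could decide which agents upgrade first is to hash the caller's IP address into a stable bucket. The sketch below shows only that decision logic, not the server around it; the function names and the salt value are made up for illustration.

```python
import hashlib


def in_rollout(client_ip, percent, salt="spire-agent-rollout"):
    """Deterministically place a caller in or out of the first `percent` of agents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{salt}:{client_ip}".encode("utf-8")).digest()
    # Map the hash onto buckets 0..99; the same IP always lands in the same bucket.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def version_for(client_ip, current, candidate, percent):
    """Answer with the candidate version only for callers inside the rollout."""
    return candidate if in_rollout(client_ip, percent) else current
```

Because the bucket depends only on the salt and the IP, raising `percent` from 10 to 50 keeps the original 10% in the rollout and adds more callers, rather than reshuffling everyone.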
As long as it’s guaranteed that all X.Y.* (1.12.*, 1.13.*, etc.) server versions are compatible with matching X.Y clients, that approach should work. I’ll just need to determine how to identify the third version component. Currently, there’s no spire-agent:X.Y tag that points to the latest X.Y release.
My reading of the upgrade doc assumes, but does not dictate this. I could be missing something there.
The guarantee we provide is stated on the upgrade page:
SPIRE Agents must not be newer than the oldest SPIRE Server that they communicate with, and may be up to one minor version older.
The "must not be newer" part is what matters, since it gives us the highest version that the agent can run.
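Read literally, the rule quoted above can be checked mechanically. The following is a minimal sketch under two assumptions that the upgrade page does not spell out here: "newer" is compared on the full `major.minor.patch` version, and an agent on a different major version is treated as unsupported. The function names are hypothetical.

```python
def parse_version(version):
    """Turn 'major.minor.patch' into a tuple of ints for numeric comparison."""
    major, minor, patch = (int(part) for part in version.strip().split("."))
    return (major, minor, patch)


def agent_is_supported(agent_version, server_versions):
    """Check an agent against every spire-server it may communicate with."""
    if not server_versions:
        raise ValueError("no spire-server versions given")
    oldest = min(parse_version(v) for v in server_versions)
    agent = parse_version(agent_version)
    # The agent must not be newer than the oldest server.
    if agent > oldest:
        return False
    # Assumption: crossing a major version is treated as unsupported.
    if agent[0] != oldest[0]:
        return False
    # The agent may be up to one minor version older.
    return oldest[1] - agent[1] <= 1
```

With servers on `1.13.0` only, an agent on `1.12.6` passes and an agent on `1.11.0` does not. With a mixed fleet of `1.13.0` and `1.12.6` servers, an agent on `1.13.0` fails because it is newer than the oldest server.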
The version the server would expose through its API is the full version (major.minor.patch), so you'd have all 3 components returned from it. So it sounds like you'll be able to use that directly to determine the image.
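The substitution into an image reference might look something like this. The repository name `ghcr.io/spiffe/spire-agent` and the idea that image tags are the bare `major.minor.patch` string are assumptions here and should be checked against the registry you actually pull from; `agent_image` is a hypothetical helper.

```python
import re

# Accept only a plain major.minor.patch string so nothing unexpected ends up in the tag.
VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")


def agent_image(server_version, repository="ghcr.io/spiffe/spire-agent"):
    """Build the agent image reference from the version a spire-server reports."""
    version = server_version.strip()
    if version.startswith("v"):
        version = version[1:]
    if not VERSION_PATTERN.match(version):
        raise ValueError(f"unexpected version string: {server_version!r}")
    return f"{repository}:{version}"
```

Rejecting anything that is not a plain three-part version keeps a malformed or unexpected response from being passed straight to the container runtime.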
@jcpunk, are you ok with this proposal to add a version endpoint?
With those version compat rules in place, I should be fine.