spire-{agent,server}: allow more gRPC connection configuration
spire-agent currently configures gRPC with the DNS resolver and round-robin load balancing. gRPC implements round robin by connecting to every address the resolver returns and then distributing RPCs across those connections in turn. Additionally, spire-server forces a disconnect every 3 minutes so that clients re-resolve DNS, in case the list of servers has changed. This works well in most situations and is a good out-of-the-box behaviour, but some deployments would benefit from being able to configure it. For example:
- An xDS resolver would give better control over resolution and allow updates to be pushed dynamically. It could also be used to route a specific node to the right “downstream” spire-server group, based on the node ID it sends in its xDS requests.
- A pick_first load balancing option would allow configurations similar to the above, but with less flexibility for the administrator.
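To make the two options above more concrete, here is a rough sketch of how agent-side settings could be turned into a gRPC target URI and a service config JSON string. The `ConnConfig` type and its field names are hypothetical and do not exist in SPIRE today; only the `dns:///` and `xds:///` target schemes and the `loadBalancingConfig` service config key are standard gRPC. With an xDS resolver the control plane would normally supply the load balancing policy itself, so the service config is mostly relevant for the DNS case.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// ConnConfig is a hypothetical agent-side configuration block.
// None of these field names exist in SPIRE today.
type ConnConfig struct {
	ResolverScheme string // "dns" (default) or "xds"
	LBPolicy       string // "round_robin" (default) or "pick_first"
	ServerAddress  string
	ServerPort     string
}

// Target builds the gRPC target URI for the configured resolver scheme.
func (c ConnConfig) Target() (string, error) {
	scheme := c.ResolverScheme
	if scheme == "" {
		scheme = "dns" // keep today's behaviour when nothing is configured
	}
	switch scheme {
	case "dns", "xds":
		return scheme + ":///" + net.JoinHostPort(c.ServerAddress, c.ServerPort), nil
	default:
		return "", fmt.Errorf("unsupported resolver scheme %q", scheme)
	}
}

// ServiceConfig builds the gRPC service config JSON that selects the
// load balancing policy.
func (c ConnConfig) ServiceConfig() (string, error) {
	policy := c.LBPolicy
	if policy == "" {
		policy = "round_robin" // keep today's behaviour when nothing is configured
	}
	switch policy {
	case "round_robin", "pick_first":
	default:
		return "", fmt.Errorf("unsupported load balancing policy %q", policy)
	}
	cfg := map[string]interface{}{
		"loadBalancingConfig": []map[string]interface{}{
			{policy: map[string]interface{}{}},
		},
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	cfg := ConnConfig{
		LBPolicy:      "pick_first",
		ServerAddress: "spire-server.example.org",
		ServerPort:    "8081",
	}
	target, err := cfg.Target()
	if err != nil {
		panic(err)
	}
	serviceConfig, err := cfg.ServiceConfig()
	if err != nil {
		panic(err)
	}
	fmt.Println(target)
	fmt.Println(serviceConfig)
}
```

In grpc-go, strings like these would typically be passed as the dial target and via `grpc.WithDefaultServiceConfig`; how SPIRE would actually wire them in is left open here.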
This is somewhat similar to what was requested in https://github.com/spiffe/spire/issues/4696, but goes a bit further: it should also cover tuning the server side. Currently spire-server forces reconnections every 3 minutes to trigger DNS resolution. As a result, spire-servers periodically redo TLS handshakes with agents even when nothing, such as a DNS update, requires it. With something like xDS this is no longer necessary, since updates can be pushed to spire-agent, so it would be good to avoid a bunch of TLS handshakes for nothing.