
Support for multiple server addresses on agent

Open jnummelin opened this issue 3 years ago • 11 comments

The current way of implementing HA setups is a bit cumbersome in many cases. When running multiple servers we need to configure each server with the proper --server-count, but the agent can be configured with only one --proxy-server-host address. Essentially this requires one to have an LB of sorts in front of the servers. While this is not really an issue in cloud environments with ELBs and such at one's disposal, it's a real burden in bare metal and similar environments.

What if we could configure the agent with multiple addresses in --proxy-server-host (or a new flag)? In that case the agent could "just" open connections to all provided servers and thus achieve the same effect as getting --server-count unique connections via the LB. The big pro (IMO) in this case is that it's pretty simple to re-configure the agent (in k0s's case it's running as a DaemonSet) based on e.g. watching some service endpoints.
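A hypothetical shape for such a flag, sketched in Go: the comma-separated form and the helper name are assumptions for illustration, not existing agent code.

```go
package main

import (
	"fmt"
	"strings"
)

// parseProxyServerHosts splits a hypothetical comma-separated
// --proxy-server-host value into individual "host:port" targets,
// trimming whitespace and dropping duplicates so the agent would
// open exactly one set of connections per distinct server.
func parseProxyServerHosts(flagValue string) []string {
	seen := make(map[string]bool)
	var hosts []string
	for _, h := range strings.Split(flagValue, ",") {
		h = strings.TrimSpace(h)
		if h == "" || seen[h] {
			continue
		}
		seen[h] = true
		hosts = append(hosts, h)
	}
	return hosts
}

func main() {
	fmt.Println(parseProxyServerHosts("10.0.0.1:8132, 10.0.0.2:8132,10.0.0.1:8132"))
	// → [10.0.0.1:8132 10.0.0.2:8132]
}
```

The agent would then dial each returned target once, instead of dialing a single LB address --server-count times.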

WDYT?

There are a couple of somewhat related issues about better support for dynamic server counts worth referencing: https://github.com/kubernetes-sigs/apiserver-network-proxy/issues/358 https://github.com/kubernetes-sigs/apiserver-network-proxy/issues/273

jnummelin avatar Aug 30 '22 12:08 jnummelin

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Nov 28 '22 12:11 k8s-triage-robot

/remove-lifecycle stale

twz123 avatar Dec 05 '22 15:12 twz123

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Mar 05 '23 16:03 k8s-triage-robot

/remove-lifecycle stale

cheftako avatar Mar 05 '23 22:03 cheftako

From another user "Perhaps konnectivity-agent should have an alternate flag --proxy-server-service-name, and it takes a value of a Kubernetes Service and looks at the underlying Endpoints object to find out the specific IP addresses it should connect to. Then it can be sure it is opening connections directly to each replica, without going through the LB."
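The Endpoints lookup described above could be sketched roughly as follows. The simplified structs below stand in for the corev1.Endpoints object an agent would actually fetch from the API server (e.g. via client-go's CoreV1().Endpoints(ns).Get(...)), and the function name is hypothetical:

```go
package main

import "fmt"

// endpointSubset is a simplified stand-in for a corev1.EndpointSubset,
// keeping only the fields needed to build dial targets.
type endpointSubset struct {
	IPs   []string
	Ports []int32
}

// proxyServerTargets flattens Endpoints subsets into the concrete
// "ip:port" list the agent would dial directly, one connection per
// replica, without going through the LB.
func proxyServerTargets(subsets []endpointSubset) []string {
	var targets []string
	for _, s := range subsets {
		for _, ip := range s.IPs {
			for _, p := range s.Ports {
				targets = append(targets, fmt.Sprintf("%s:%d", ip, p))
			}
		}
	}
	return targets
}

func main() {
	subsets := []endpointSubset{
		{IPs: []string{"10.0.0.1", "10.0.0.2"}, Ports: []int32{8132}},
	}
	fmt.Println(proxyServerTargets(subsets))
	// → [10.0.0.1:8132 10.0.0.2:8132]
}
```

Watching the Endpoints object (rather than a one-shot Get) would let the agent react to replica churn without a restart.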

Another example "Setting the kubernetes service name directly for --proxy-server-host would be a nice improvement and I have also wondered similarly if that is possible."

cheftako avatar Mar 05 '23 22:03 cheftako

Would it make sense, when multiple servers are set simultaneously, to assume there is no LB involved and ignore any indication from the server on the number of connections to attempt?

Alternatively, we could have an explicit flag to turn off LB retry logic. For now, if multiple endpoints are configured but the LB retry is not turned off, we could error out. This would more easily allow us to support this case in the future once we properly understood what the retry/reconnect logic should be. (E.g. if the requested connection count > configured hosts, do we randomly pick an address to connect to? Do we round robin? Do we attempt to keep the count per host even?)
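One possible answer to the "keep the count per host even" question is a simple round-robin assignment; a minimal sketch, assuming a hypothetical planning helper (not existing agent code):

```go
package main

import "fmt"

// connectionsPerHost spreads a requested connection count across the
// configured hosts as evenly as possible: each host gets count/len(hosts)
// connections, and the first count%len(hosts) hosts get one extra.
func connectionsPerHost(hosts []string, count int) map[string]int {
	plan := make(map[string]int, len(hosts))
	if len(hosts) == 0 {
		return plan
	}
	for i := 0; i < count; i++ {
		plan[hosts[i%len(hosts)]]++ // round-robin assignment
	}
	return plan
}

func main() {
	fmt.Println(connectionsPerHost([]string{"a", "b", "c"}, 5))
	// → map[a:2 b:2 c:1]
}
```

On reconnect, the agent could re-run the plan against the current host list and only top up hosts that are below their target count.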

cheftako avatar Mar 05 '23 22:03 cheftako

Would it make sense, when multiple servers are set simultaneously, to assume there is no LB involved and ignore any indication from the server on the number of connections to attempt?

IMO that sounds good. In our use case it would be much easier to just configure the agent with all server addresses.

Perhaps konnectivity-agent should have an alternate flag --proxy-server-service-name, and it takes a value of a Kubernetes Service and looks at the underlying Endpoints object to find out the specific IP addresses it should connect to.

This sounds pretty good to me. In many cases, at least in all cases in the world of k0s, the server is sitting next to the API server and thus we'd be able to use the kubernetes svc endpoints pretty much directly.

Another example "Setting the kubernetes service name directly for --proxy-server-host would be a nice improvement and I have also wondered similarly if that is possible."

This would AFAIK have the downside that you could not run the agent with host networking, as it cannot resolve the svc names.

jnummelin avatar May 31 '23 08:05 jnummelin

/lifecycle frozen

liangyuanpeng avatar Oct 10 '23 07:10 liangyuanpeng

@liangyuanpeng Why frozen?

@cheftako Has there been any discussions on this in the SIG group? If any of the alternatives proposed sounds feasible, someone could have a go at the implementation.

jnummelin avatar Jan 30 '24 21:01 jnummelin

/assign @cheftako

jkh52 avatar Apr 22 '24 17:04 jkh52