
Extend agent to proxy KAS traffic generated on node network

irozzo-1A opened this pull request • 20 comments

This PR implements #177.

irozzo-1A avatar Mar 01 '21 15:03 irozzo-1A

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: irozzo-1A To complete the pull request process, please assign anfernee after the PR has been reviewed. You can assign the PR to them by writing /assign @anfernee in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot avatar Mar 01 '21 15:03 k8s-ci-robot

Hi @irozzo-1A. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Mar 01 '21 15:03 k8s-ci-robot

this is a great feature that enables bi-directional communication between cloud and edge

huxiaoliang avatar Mar 17 '21 01:03 huxiaoliang

Not seeing any server code changes or grpc proto changes. Assuming this is not the complete work?

cheftako avatar Aug 12 '21 23:08 cheftako

Not seeing any server code changes or grpc proto changes. Assuming this is not the complete work?

@cheftako Indeed, this is not the complete work; it implements only the agent changes. The server part is supposed to be implemented in a dedicated PR (see the aggregator PR and this thread on Slack for more context).

I tried to implement this without the need for any gRPC proto change. In order to distinguish cluster -> master connections from master -> cluster connections, I adopted the convention of using even ConnectIDs for the former and odd ones for the latter (see https://github.com/kubernetes-sigs/apiserver-network-proxy/pull/176/files#diff-c2eff45b2c7db0ae864aecb2e8b733dfbdfe7070ffe6f7feb1204e086c403a0aR523-R525). The server code where the ConnectID is generated will also need to be adapted accordingly. If you see any problems with this approach (I did not so far, but I could have missed something), the alternative would be to add something to the proto, besides the ConnectID, to identify the initiator (i.e. agent or server); it should be optional for backward compatibility.
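
To make the convention concrete, here is a minimal sketch; the helper names are mine, not from the PR diff, and the parity assignment could end up reversed in the final code:

import "sync/atomic"

// nodeConnID counts tunnels dialed from the node (cluster) network.
var nodeConnID uint64

// nextNodeConnID hands out ConnectIDs of a single parity for cluster -> master
// dials, so the two directions can be told apart without touching the proto.
func nextNodeConnID() uint64 {
	return 2 * atomic.AddUint64(&nodeConnID, 1) // 2, 4, 6, ...
}

// isNodeInitiated reports whether a ConnectID belongs to a cluster -> master
// connection under this convention.
func isNodeInitiated(connectID uint64) bool {
	return connectID%2 == 0
}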

irozzo-1A avatar Aug 16 '21 10:08 irozzo-1A

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
  • Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot avatar Sep 06 '21 20:09 k8s-ci-robot

@irozzo-1A: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test cla/linuxfoundation

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Sep 06 '21 22:09 k8s-ci-robot

Great feature!

/ok-to-test

SataQiu avatar Sep 29 '21 04:09 SataQiu

This looks like a nice feature, any progress?

/cc @cheftako @andrewsykim @anfernee

SataQiu avatar Oct 08 '21 03:10 SataQiu

Hello! What is the current state of this PR? @jnummelin and I at k0sproject.io used this PR as the base for our feature branch, and it would be nice to sync up the changes we introduced to enable handling worker-to-KAS traffic.

mikhail-sakhnov avatar Nov 23 '21 13:11 mikhail-sakhnov

Hi @soider, @PratikDeoghare prepared the server side (PR coming soon) and I'm preparing a manual e2e test to verify that everything works as expected. I should have something by the end of the week. After that, I will kindly ask @cheftako to do another pass :-)

irozzo-1A avatar Nov 23 '21 13:11 irozzo-1A

@irozzo-1A @PratikDeoghare have you already implemented it? We have some changes in our fork as well (no PR yet, but the current source is at https://github.com/soider/apiserver-network-proxy/tree/node-to-kas-proxying) to support server-side handling of the traffic.

mikhail-sakhnov avatar Nov 23 '21 14:11 mikhail-sakhnov

@irozzo-1A @PratikDeoghare have you already implemented it? We have some changes in our fork as well (no PR yet, but the current source is at https://github.com/soider/apiserver-network-proxy/tree/node-to-kas-proxying) to support server-side handling of the traffic.

Yes, this branch drafts the complete feature: https://github.com/irozzo-1A/apiserver-network-proxy/tree/extend-proxy. But as I mentioned, we need to test it, polish it, and create some PRs out of it.

You can see the diff here: https://github.com/irozzo-1A/apiserver-network-proxy/pull/3

irozzo-1A avatar Nov 23 '21 14:11 irozzo-1A

/test pull-apiserver-network-proxy-test

irozzo-1A avatar Dec 03 '21 20:12 irozzo-1A

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar May 13 '22 17:05 k8s-triage-robot

@irozzo-1A we are interested in getting this merged, where does this stand?

jkh52 avatar May 13 '22 20:05 jkh52

Hi @jkh52, this PR is ready for review. When this is merged we can proceed with the remaining part.

irozzo-1A avatar May 14 '22 10:05 irozzo-1A

/remove-lifecycle stale

irozzo-1A avatar May 14 '22 10:05 irozzo-1A

Hey @cheftako, would it be possible to have another pass here?

irozzo-1A avatar Jun 03 '22 17:06 irozzo-1A

@irozzo-1A: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Aug 17 '22 12:08 k8s-ci-robot

Does this PR work as is? I don't see any server-side changes, and the server doesn't currently handle DIAL_REQ packets from the backend.

I'd like to propose an alternative approach: rather than changing the protocols to support dialing in both directions, what if you just added an agent on the control-plane node alongside the apiserver, and simply used the konnectivity-client to connect from the node side? A couple details:

  • This might require additional backend selection logic to select the right agent backend for each direction, I'm not sure if the DestHost or DefaultRoute backend managers can handle this today.
  • For the client connection, this could be implemented as a separate proxy component, or packaged into the agent binary. I'd suggest implementing it as a library, and starting with a separate command. Once it's proven out with that approach, we could consider building it into the agent. This will also help ensure that it's loosely coupled to the existing agent logic.

tallclair avatar Nov 09 '22 21:11 tallclair

Does this PR work as is? I don't see any server-side changes, and the server doesn't currently handle DIAL_REQ packets from the backend.

Indeed, this only handles the Agent part; the Server part is almost ready though, it was waiting for this PR to be merged first. See: https://github.com/kubernetes-sigs/apiserver-network-proxy/pull/176#issuecomment-976640650

I'd like to propose an alternative approach: rather than changing the protocols to support dialing in both directions, what if you just added an agent on the control-plane node alongside the apiserver, and simply used the konnectivity-client to connect from the node side? A couple details:

  • This might require additional backend selection logic to select the right agent backend for each direction, I'm not sure if the DestHost or DefaultRoute backend managers can handle this today.
  • For the client connection, this could be implemented as a separate proxy component, or packaged into the agent binary. I'd suggest implementing it as a library, and starting with a separate command. Once it's proven out with that approach, we could consider building it into the agent. This will also help ensure that it's loosely coupled to the existing agent logic.

I presume that the spirit of your proposal is to do something less intrusive or risky. I would like to point out that:

  • The code path to handle the nodes->control_plane direction is protected by a feature gate.
  • The messages are unchanged. The only protocol change is that I used the connection ID parity to determine the direction. This is not clean, and it also means that the server is not backward compatible when the feature is activated, but the problem can be solved by introducing an optional direction flag on the data messages (a rough sketch of what such a flag could look like follows this list). I was waiting for feedback on this point. This is to say that IMO the risk of regression is low, and the clean-up should be fairly easy in case the feature is dismissed.
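
As a sketch of what such a flag could look like on the server side; the message type and the initiator field below are invented stand-ins for illustration, not part of the current proto:

// dataPacket stands in for the real Data message; the Initiator field is the
// hypothetical optional addition being discussed here, not an existing one.
type dataPacket struct {
	ConnectID uint64
	Initiator int32 // 0 = unset (old peers), 1 = agent-initiated, 2 = server-initiated
}

// fromNode decides the direction from the explicit flag when present, and
// falls back to the ConnectID parity convention for peers that never set it.
func fromNode(p dataPacket) bool {
	switch p.Initiator {
	case 1:
		return true
	case 2:
		return false
	default:
		return p.ConnectID%2 == 0
	}
}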

That said, I think the proposal is interesting, I see the following downsides though:

  • Increased operational complexity: we need to deploy the proxies plus the control-plane agents with a specific configuration.
  • We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.
  • We should IMO make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.
  • Increased resource consumption: it would double the TCP connections between the nodes and the server in the worst case.

Also, I presume that if we decide to go the way you proposed, we should update the KEP and go through another validation, right?

irozzo-1A avatar Nov 11 '22 11:11 irozzo-1A

I presume that the spirit of your proposal is doing something less intrusive or risky

It's not only about reducing risk, it's also about minimizing complexity. It looks like your changes reimplement some of the agent functionality in the server, and vice-versa. Even if the changes are behind a feature gate, the additional code is still a maintenance burden.

I think my proposal reduces the amount of code by reusing existing mechanisms, but it also keeps the architecture conceptually simpler. There is only one connection type, dials flow in one direction, and we don't need to branch data packet handling based on connection ID divisibility.

Increased operational complexity: we need to deploy the proxies plus the control-plane agents with a specific configuration.

Yes, but that also comes with more flexibility. For example if your proxy-server is on a different node than the apiserver.

We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.

Isn't this also true with your current approach? It looks like the server gets the dial address from the DialReq?

We should IMO make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.

Yes, I agree this is the biggest downside, but this is also aligned with other changes that we want to make to have smarter backend selection. I think this should work with DestHost selection as long as you don't have overlapping IPs in your control plane & cluster networks (or maybe you can just use localhost). Another option would be to make it configurable by source, so that agents get backends selected from a different pool. I'm not sure exactly what that would look like.

A benefit to this approach is this added complexity is siloed off to the backend manager component, and doesn't affect the main request code paths.

Increased resource consumption: it would double the TCP connections between the nodes and the server in the worst case.

The proxy-server to control-plane agent connection is already multiplexed, so that connection overhead should be minimal. It does add more frontend connections since it's not reusing the backend connection from the worker nodes, but we want to make that multiplexed anyway.


As for next steps, I'll talk this over with some of the other konnectivity maintainers.

tallclair avatar Nov 11 '22 18:11 tallclair

Thanks @tallclair, let me share my final thoughts while waiting for an update about the next steps.

I think my proposal reduces the amount of code by reusing existing mechanisms, but it also keeps the architecture conceptually simpler. There is only one connection type, dials flow in one direction, and we don't need to branch data packet handling based on connection ID divisibility.

Probably you're right, but it's not completely obvious to me; IMO it largely depends on the changes required by the agent-selection logic. Also, the proxy will look like a hybrid between the current server and the agent; it's possible to factor out code to limit duplication, but the same could be done with the current approach. The problem of determining the direction is definitely one of the bigger downsides; IMO having an optional direction flag would already be a bit better.

Yes, but that also comes with more flexibility. For example if your proxy-server is on a different node than the apiserver.

The only advantage I see would be being able to scale the server and the control-plane agents differently, but I don't know how interesting this would be.

We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.

Isn't this also true with your current approach? It looks like the server gets the dial address from the DialReq?

It's totally true; the main difference is that this filtering logic was supposed to be on the server, which is the component that is supposed to be on the control-plane side. Anyway, I think it would not be a bad thing to be able to constrain the possible destinations in the node network as well.

We should IMO make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.

Yes, I agree this is the biggest downside, but this is also aligned with other changes that we want to make to have smarter backend selection. I think this should work with DestHost selection as long as you don't have overlapping IPs in your control plane & cluster networks (or maybe you can just use localhost). Another option would be to make it configurable by source, so that agents get backends selected from a different pool. I'm not sure exactly what that would look like.

I think that using DestHost would imply that the node-network proxies know the hosts where the control-plane agents reside, which may vary, and it anyway represents a configuration burden. I would prefer a pool-selection mechanism. Also, it may be useful to avoid control-plane agents getting selected when using the current strategies, because:

  • we may have clashes with DestHost as you mentioned
  • the default strategy would basically be broken

This would require something to signal that the agent is on the control-plane network, and some special handling of this case.

A benefit to this approach is this added complexity is siloed off to the backend manager component, and doesn't affect the main request code paths.

Increased resource consumption: it would double the TCP connections between the nodes and the server in the worst case.

The proxy-server to control-plane agent connection is already multiplexed, so that connection overhead should be minimal. It does add more frontend connections since it's not reusing the backend connection from the worker nodes, but we want to make that multiplexed anyway.

Yeah, I don't know how big the problem would be. In a 5K-node cluster with an agent per node, we would go from 5K to 10K persistent connections to the server. But probably, in a 5K-node cluster, it would not be recommended to have 1 agent per node. Currently, the way traffic is carried from the node network to the control-plane network assumes that there is 1 agent per node, but decoupling agents from the node-network proxies would reduce the problem to the proxies only.

irozzo-1A avatar Nov 14 '22 10:11 irozzo-1A

I spoke with a few of the konnectivity maintainers today, and they agreed that they're interested in further exploring this path, and also that this means a KEP update. Please tag me on the PR if/when you're able to get to that.


Also, the proxy will look like a hybrid between the current server and the agent; it's possible to factor out code to limit duplication, but the same could be done with the current approach.

I think we can just reuse the konnectivity-client, so the cluster network listener looks pretty simple. This is oversimplified, but here's approximately what your handleConnection function could become:

import (
	"context"
	"io"
	"net"
	"time"

	"google.golang.org/grpc"

	kclient "sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client"
)

func (a *Client) handleConnection(protocol, address string, conn net.Conn) error {
	// ... (setup elided; serverAddress is assumed to be available here)

	opts := []grpc.DialOption{ // copied from egress_selector.go
		grpc.WithBlock(),
		grpc.WithReturnConnectionError(),
		grpc.WithTimeout(30 * time.Second), // matches http.DefaultTransport dial timeout
	}
	// Open a single-use tunnel to the proxy server, then dial the target
	// address through it, much like the apiserver's egress selector does.
	tunnel, err := kclient.CreateSingleUseGrpcTunnel(context.TODO(), serverAddress, opts...)
	if err != nil {
		return err
	}
	serverConn, err := tunnel.DialContext(context.TODO(), protocol, address)
	if err != nil {
		return err
	}

	// Pipe bytes in both directions between the local connection and the tunnel.
	go io.Copy(conn, serverConn) // TODO: handle error
	go io.Copy(serverConn, conn)

	return nil
}
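
And a rough idea of how the node-side listener could drive it, reusing the Client receiver from the sketch above (equally oversimplified; the listen and target addresses are placeholders and error handling is minimal):

import (
	"log"
	"net"
)

// serveLocal accepts connections on a local address and proxies each of them
// to targetAddr through the tunnel via handleConnection from the sketch above.
func (a *Client) serveLocal(listenAddr, targetAddr string) error {
	ln, err := net.Listen("tcp", listenAddr)
	if err != nil {
		return err
	}
	defer ln.Close()
	for {
		conn, err := ln.Accept()
		if err != nil {
			return err
		}
		go func(c net.Conn) {
			if err := a.handleConnection("tcp", targetAddr, c); err != nil {
				log.Printf("proxying connection failed: %v", err)
				c.Close()
			}
		}(conn)
	}
}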

tallclair avatar Nov 14 '22 22:11 tallclair

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Feb 12 '23 23:02 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot avatar Mar 15 '23 00:03 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-triage-robot avatar Apr 14 '23 00:04 k8s-triage-robot

@k8s-triage-robot: Closed this PR.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 14 '23 00:04 k8s-ci-robot