apiserver-network-proxy
Extend agent to proxy KAS traffic generated on node network
This PR implements #177.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: irozzo-1A
To complete the pull request process, please assign anfernee after the PR has been reviewed.
You can assign the PR to them by writing /assign @anfernee in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.
Hi @irozzo-1A. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This is a great feature that enables bi-directional communication between cloud and edge.
Not seeing any server code changes or grpc proto changes. Assuming this is not the complete work?
@cheftako Indeed, this is not the complete work; it implements only the agent changes. The server part is supposed to be implemented in a dedicated PR (see the aggregator PR and this thread on Slack for more context).
I tried to implement this without the need for any gRPC proto change. In order to distinguish cluster -> master connections from master -> cluster connections, I adopted the convention of using even ConnectIDs for one of the two cases and odd ConnectIDs for the other (see https://github.com/kubernetes-sigs/apiserver-network-proxy/pull/176/files#diff-c2eff45b2c7db0ae864aecb2e8b733dfbdfe7070ffe6f7feb1204e086c403a0aR523-R525). The server code where the ConnectID is generated will also need to be adapted accordingly.
If you see any problems with this approach (I did not so far, but I could have missed something), the alternative would be to add something to the proto to identify the initiator (i.e. agent or server) besides the ConnectID; it should be optional for backward compatibility.
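For illustration, a minimal sketch of what such a parity check could look like on the receiving side (the helper name and the even/odd assignment are assumptions, not taken from this PR):

// isAgentInitiated reports whether a connection was dialed from the
// cluster (node) side, based on the ConnectID parity convention
// described above. Assumption for illustration: odd IDs mark
// cluster -> master dials and even IDs mark master -> cluster dials;
// the real mapping has to match the server-side ID generator.
func isAgentInitiated(connectID int64) bool {
	return connectID%2 == 1
}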
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
:memo: Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.
It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
- If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
- If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
- If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
- Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@irozzo-1A: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
In response to this:
/test cla/linuxfoundation
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Great feature!
/ok-to-test
This looks like a nice feature, any progress?
/cc @cheftako @andrewsykim @anfernee
Hello! What is the current state of this PR? @jnummelin and I at k0sproject.io used this PR as the base for our feature branch, and it would be nice to sync up the changes we introduced to enable handling worker-to-KAS traffic.
Hi @soider, @PratikDeoghare prepared the server side (PR coming soon) and I'm preparing a manual e2e test to verify that everything works as expected. I should have something by the end of the week. After that, I will kindly ask @cheftako to do another pass :-)
@irozzo-1A @PratikDeoghare have you already implemented it? We have some changes in our fork as well (no PR yet, but the current source is at https://github.com/soider/apiserver-network-proxy/tree/node-to-kas-proxying) to support server-side handling of the traffic.
Yes, this branch drafts the complete feature: https://github.com/irozzo-1A/apiserver-network-proxy/tree/extend-proxy. But as I mentioned, we need to test it, polish it, and create some PRs out of it.
You can see the diff here: https://github.com/irozzo-1A/apiserver-network-proxy/pull/3
/test pull-apiserver-network-proxy-test
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
@irozzo-1A we are interested in getting this merged; where does this stand?
Hi @jkh52, this PR is ready for review. When this is merged we can proceed with the remaining part.
/remove-lifecycle stale
Hey @cheftako, would it be possible to have another pass here?
@irozzo-1A: PR needs rebase.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Does this PR work as is? I don't see any server-side changes, and the server doesn't currently handle DIAL_REQ packets from the backend.
I'd like to propose an alternative approach: rather than changing the protocols to support dialing in both directions, what if you just added an agent on the control-plane node alongside the apiserver, and simply used the konnectivity-client to connect from the node side? A couple of details:
- This might require additional backend selection logic to select the right agent backend for each direction, I'm not sure if the DestHost or DefaultRoute backend managers can handle this today.
- For the client connection, this could be implemented as a separate proxy component, or packaged into the agent binary. I'd suggest implementing it as a library, and starting with a separate command. Once it's proven out with that approach, we could consider building it into the agent. This will also help ensure that it's loosely coupled to the existing agent logic.
Does this PR work as is? I don't see any server-side changes, and the server doesn't currently handle DIAL_REQ packets from the backend.
Indeed, this only handles the agent part; the server part is almost ready though, it was waiting for this PR to be merged. See: https://github.com/kubernetes-sigs/apiserver-network-proxy/pull/176#issuecomment-976640650
I presume that the spirit of your proposal is to do something less intrusive or risky; I would like to point out that:
- The code path to handle the nodes->control_plane direction is protected by a feature gate.
- The messages are unchanged. The only protocol change is that I used the connection ID parity to determine the direction. This is not clean, and it also means that the server is not backward compatible when the feature is activated, but the problem can be solved by introducing an optional direction flag on the data messages (sketched below). I was waiting for feedback on this point. This is to say that, IMO, the risk of regression is low and the clean-up should be fairly easy in case the feature is dismissed.
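For illustration only, such an optional direction flag could look roughly like this in Go (the type, names, and values are hypothetical; nothing like this exists in the current proto):

// Direction is a hypothetical optional field on data messages. The zero
// value means "unset", so legacy senders that never set it keep working,
// and the server can fall back to the ConnectID parity convention.
type Direction int32

const (
	DirectionUnset           Direction = 0 // legacy sender: fall back to ID parity
	DirectionMasterToCluster Direction = 1
	DirectionClusterToMaster Direction = 2
)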
That said, I think the proposal is interesting. I see the following downsides, though:
- Increased operational complexity: we need to deploy the proxies plus the control-plane agents with a specific configuration.
- We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.
- We should, IMO, make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.
- Increased resource consumption: it would double the TCP connections between nodes and server in the worst case.
Also, I presume that if we decide to go the way you proposed, we should update the KEP and go through another validation, right?
I presume that the spirit of your proposal is to do something less intrusive or risky
It's not only about reducing risk; it's also about minimizing complexity. It looks like your changes reimplement some of the agent functionality in the server, and vice versa. Even if the changes are behind a feature gate, the additional code is still a maintenance burden.
I think my proposal reduces the amount of code by reusing existing mechanisms, but it also keeps the architecture conceptually simpler. There is only one connection type, dials flow in one direction, and we don't need to branch data packet handling based on connection ID divisibility.
Increased operational complexity: we need to deploy the proxies plus the control-plane agents with a specific configuration.
Yes, but that also comes with more flexibility. For example if your proxy-server is on a different node than the apiserver.
We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.
Isn't this also true with your current approach? It looks like the server gets the dial address from the DialReq?
We should, IMO, make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.
Yes, I agree this is the biggest downside, but this is also aligned with other changes that we want to make to have smarter backend selection. I think this should work with DestHost selection as long as you don't have overlapping IPs in your control plane & cluster networks (or maybe you can just use localhost). Another option would be to make it configurable by source, so that agents get backends selected from a different pool. I'm not sure exactly what that would look like.
A benefit to this approach is that this added complexity is siloed off to the backend manager component and doesn't affect the main request code paths.
Increased resource consumption: it would double the TCP connections between nodes and server in the worst case.
The proxy-server to control-plane agent connection is already multiplexed, so that connection overhead should be minimal. It does add more frontend connections since it's not reusing the backend connection from the worker nodes, but we want to make that multiplexed anyway.
As for next steps, I'll talk this over with some of the other konnectivity maintainers.
Thanks @tallclair, let me share my final thoughts while waiting for an update about the next steps.
I think my proposal reduces the amount of code by reusing existing mechanisms, but it also keeps the architecture conceptually simpler. There is only one connection type, dials flow in one direction, and we don't need to branch data packet handling based on connection ID divisibility.
Probably you're right, but it's not completely obvious to me; IMO it largely depends on the changes required by the agent-selection logic. Also, the proxy will look like a hybrid between the current server and the agent; it's possible to factor out code to limit duplication, but the same could be done with the current approach. The problem of determining the direction is definitely one of the bigger downsides; IMO having an optional direction flag would already be a bit better.
Yes, but that also comes with more flexibility. For example if your proxy-server is on a different node than the apiserver.
The only advantage I see would be being able to scale the server and the control-plane agents differently, but I don't know how interesting this would be.
We should add filtering logic to make sure that whoever has access to the proxy does not gain full access to the control-plane node network.
Isn't this also true with your current approach? It looks like the server gets the dial address from the DialReq?
It's totally true; the main difference is that this filtering logic was supposed to be on the server, which is the component that is supposed to be on the control-plane side. Anyway, I think it would not be a bad thing to be able to constrain the possible destinations in the nodes network as well.
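For illustration, such destination filtering on the node-network side could be as simple as an allowlist check like the following (all names are hypothetical, not from this repo):

import "fmt"

// allowedDestinations is a hypothetical allowlist of dial targets that
// the node-network proxy accepts; anything else is rejected up front.
var allowedDestinations = map[string]bool{
	"kubernetes.default.svc:443": true, // example entry: the KAS service address
}

// checkDestination would run before dialing on behalf of a frontend.
func checkDestination(address string) error {
	if !allowedDestinations[address] {
		return fmt.Errorf("destination %q not allowed", address)
	}
	return nil
}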
We should, IMO, make sure that the default selection logic somehow excludes the control-plane agents; otherwise the risk of failure due to misconfiguration is high.
Yes, I agree this is the biggest downside, but this is also aligned with other changes that we want to make to have smarter backend selection. I think this should work with DestHost selection as long as you don't have overlapping IPs in your control plane & cluster networks (or maybe you can just use localhost). Another option would be to make it configurable by source, so that agents get backends selected from a different pool. I'm not sure exactly what that would look like.
I think that using DestHost would imply that the node-network proxies know the hosts where the control-plane agents reside, which may vary, and it represents a configuration burden anyway. I would prefer a pool-selection mechanism.
Also, it may be useful to avoid control-plane agents getting selected when using the current strategies, because:
- we may have clashes with DestHost, as you mentioned
- the default strategy would basically be broken
This would require something to signal that the agent is on the control-plane network, and some special handling of this case; a rough sketch of what that could look like follows.
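For illustration, pool-based selection keyed by the agent's network could look like this (all types and names are hypothetical, not from this repo):

import (
	"fmt"
	"math/rand"
)

// Backend stands in for a registered agent connection.
type Backend struct {
	agentID string
}

// BackendPools keys agents by the network they sit on, so dials coming
// from the node network are only ever routed to control-plane agents,
// and apiserver dials only to cluster agents.
type BackendPools map[string][]Backend

// Pick returns a random backend from the requested pool.
func (p BackendPools) Pick(pool string) (Backend, error) {
	agents := p[pool]
	if len(agents) == 0 {
		return Backend{}, fmt.Errorf("no backend registered in pool %q", pool)
	}
	return agents[rand.Intn(len(agents))], nil
}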
A benefit to this approach is that this added complexity is siloed off to the backend manager component and doesn't affect the main request code paths.
Increased resource consumption: it would double the TCP connections between nodes and server in the worst case.
The proxy-server to control-plane agent connection is already multiplexed, so that connection overhead should be minimal. It does add more frontend connections since it's not reusing the backend connection from the worker nodes, but we want to make that multiplexed anyway.
Yeah, I don't know how big the problem would be. In a 5K-node cluster with an agent per node, we would go from 5K to 10K persistent connections to the server. But probably, in a 5K-node cluster, it would not be recommended to have one agent per node. Currently, the way traffic is carried from the node network to the control-plane network assumes that there is one agent per node, but decoupling agents from node-network proxies would reduce the problem to the proxies only.
I spoke with a few of the konnectivity maintainers today, and they agreed they're interested in further exploring this path, and also that this means a KEP update. Please tag me on the PR if/when you're able to get to that.
Also, the proxy will look like a hybrid between the current server and the agent; it's possible to factor out code to limit duplication, but the same could be done with the current approach.
I think we can just reuse the konnectivity-client, so the cluster network listener looks pretty simple. This is oversimplified, but here's approximately what your handleConnection function could become:
import (
	"context"
	"io"
	"net"
	"time"

	"google.golang.org/grpc"

	kclient "sigs.k8s.io/apiserver-network-proxy/konnectivity-client/pkg/client"
)

func (a *Client) handleConnection(protocol, address string, conn net.Conn) error {
	// ... (serverAddress, transport credentials, and other setup elided)
	opts := []grpc.DialOption{ // copied from egress_selector.go
		grpc.WithBlock(),
		grpc.WithReturnConnectionError(),
		grpc.WithTimeout(30 * time.Second), // matches http.DefaultTransport dial timeout
	}
	tunnel, err := kclient.CreateSingleUseGrpcTunnel(context.TODO(), serverAddress, opts...)
	if err != nil {
		return err
	}
	serverConn, err := tunnel.DialContext(context.TODO(), protocol, address)
	if err != nil {
		return err
	}
	go io.Copy(conn, serverConn) // TODO: handle copy errors and close both ends
	go io.Copy(serverConn, conn)
	return nil
}
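To complete the picture, a hypothetical driver for the sketch above could be a plain local listener that forwards every accepted connection through the tunnel (again illustrative only; the method name and wiring are assumptions):

func (a *Client) serveLocal(listenAddr, destAddr string) error {
	ln, err := net.Listen("tcp", listenAddr)
	if err != nil {
		return err
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			return err
		}
		go func(c net.Conn) {
			// Each accepted connection gets its own tunnel to the server.
			if err := a.handleConnection("tcp", destAddr, c); err != nil {
				c.Close()
			}
		}(conn)
	}
}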
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Mark this PR as fresh with /remove-lifecycle rotten
- Close this PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed
You can:
- Reopen this PR with /reopen
- Mark this PR as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.