Adjust Discovery throttling
I'm using kpt live apply for a bunch of CRDs. However, as the number of CRDs increases, I'm finding it nearly impossible to use because I'm getting throttled after every CRD is applied.
Turns out the client is doing a discovery process for every single CRD that's present on the api-server after every new CRD is applied. This is not a problem exclusive to kpt, and there are some issues in the kubectl project to address this:
- https://github.com/kubernetes/kubectl/issues/1126
- https://github.com/kubernetes/kubernetes/pull/105520
- https://github.com/kubernetes/kubernetes/pull/107131
Disabling it completely is probably not a good idea, but I just wanted to get the ball rolling to address this issue and adjust these parameters.
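For reference, a minimal sketch of where these client-side discovery throttling parameters live. This is cli-runtime's generic plumbing, not necessarily how kpt wires it up, and the WithDiscoveryQPS setter mentioned in the comments is an assumption about newer cli-runtime releases only.

```go
// Minimal sketch (cli-runtime plumbing, not necessarily kpt's exact wiring) of
// where the client-side discovery throttle is configured. It lives on
// ConfigFlags, separately from rest.Config's QPS/Burst.
package main

import (
	"fmt"

	"k8s.io/cli-runtime/pkg/genericclioptions"
)

func main() {
	// WithDiscoveryBurst sets the burst used by the (cached) discovery client;
	// newer cli-runtime releases also expose a WithDiscoveryQPS setter
	// (assumption: check the version you vendor before relying on it).
	flags := genericclioptions.NewConfigFlags(true).WithDiscoveryBurst(300)

	dc, err := flags.ToDiscoveryClient()
	if err != nil {
		panic(err)
	}
	// ServerGroups is the kind of call that refills the discovery cache and
	// gets throttled after every CRD apply.
	groups, err := dc.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Println("API groups on the server:", len(groups.Groups))
}
```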
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
View this failed invocation of the CLA check for more information.
For the most up to date status, view the checks section at the bottom of the pull request.
Thanks @jashandeep-sohi for the PR.
@karlkfi Can you please take a look at this?
🤔
This should probably just be part of UpdateQPS, so that if server-side throttling is enabled the client-side discovery throttling is disabled.
I had assumed that disabling the QPS would also disable the discovery QPS, but I think the discovery QPS has its own separate default, and that’s why you’re seeing a delay.
How many CRDs are you applying in one package tho?
We might need to build a smarter discovery mapper to really solve this problem. Right now the cache invalidation is all or nothing and preemptively loads all the CRDs after each reset.
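For context, here is a rough sketch of the kind of check being described, not the actual UpdateQPS code: probe the apiserver for API Priority and Fairness by looking for the APF response headers on a /livez/ping request, and only then disable client-side rate limiting. The function names here are illustrative.

```go
// Rough sketch of the pattern described above (not the actual UpdateQPS code):
// detect server-side API Priority and Fairness and, only if it is enabled,
// disable client-side rate limiting on the rest.Config.
package sketch

import (
	"context"
	"net/http"

	"k8s.io/client-go/rest"
)

// flowControlEnabled looks for the APF response headers that the apiserver adds
// to every response when Priority and Fairness is active.
func flowControlEnabled(ctx context.Context, cfg *rest.Config) (bool, error) {
	httpClient, err := rest.HTTPClientFor(cfg)
	if err != nil {
		return false, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, cfg.Host+"/livez/ping", nil)
	if err != nil {
		return false, err
	}
	resp, err := httpClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.Header.Get("X-Kubernetes-PF-FlowSchema-UID") != "", nil
}

// maybeDisableClientThrottling mirrors the idea above: if the server throttles
// for us, stop throttling client-side (negative values disable client-go's
// client-side rate limiter).
func maybeDisableClientThrottling(ctx context.Context, cfg *rest.Config) error {
	enabled, err := flowControlEnabled(ctx, cfg)
	if err != nil {
		return err
	}
	if enabled {
		cfg.QPS = -1
		cfg.Burst = -1
	}
	return nil
}
```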
> This should probably just be part of UpdateQPS, so that if server-side throttling is enabled the client-side discovery throttling is disabled.
> I had assumed that disabling the QPS would also disable the discovery QPS, but I think the discovery QPS has its own separate default, and that’s why you’re seeing a delay.
> How many CRDs are you applying in one package tho?
> We might need to build a smarter discovery mapper to really solve this problem. Right now the cache invalidation is all or nothing and preemptively loads all the CRDs after each reset.
It's not even that many CRDs at the moment. It's about 26 right now (all the cluster-api ones), but that's certainly expected to increase as we start using more crossplane ones.
I followed your lead, and my understanding is it's not primarily the number of CRDs you're applying that matters, but rather the total number of different API Groups that are already installed on the cluster:
- Apply the CRDs w/ kpt
- On every CRD update this resets the RESTMapper e.g. https://github.com/kubernetes-sigs/cli-utils/blob/0cb95eeda69d254472d3afc81dacfb0d9344fcd0/pkg/kstatus/watcher/object_status_reporter.go#L576-L589
- Which eventually makes its way to https://github.com/kubernetes/client-go/blob/cab7ba1d4a523956b6395dcbe38620159ac43fef/discovery/cached/disk/cached_discovery.go#L266-L274
- And that's basically the equivalent of `rm -rf $HOME/.kube/cache/discovery/$server/*`, causing the discovery client to repopulate all of the CRD groups installed (in addition to the one that triggered the invalidation). This happens on every CRD apply!
Perhaps the real fix is better cache invalidation. Wouldn't it be enough to just invalidate a single API group, i.e. the one that's being applied? Maybe I should also open up an issue upstream, but that seems like a much bigger change that'll take some time to make its way through.
Regardless, I think adjusting these parameters (or providing a way to customize them at runtime) would be helpful.
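To make the single-group invalidation idea above concrete, here is a purely hypothetical sketch. Nothing like InvalidateGroup exists in client-go today; it is invented here only to illustrate the proposal.

```go
// Hypothetical sketch of per-group invalidation (InvalidateGroup does not exist
// in client-go today; it is invented here to illustrate the proposal above).
package discoverysketch

import "k8s.io/client-go/discovery"

// GroupInvalidator is what the cached discovery client could look like if the
// cache were scoped per API group instead of being all-or-nothing.
type GroupInvalidator interface {
	discovery.CachedDiscoveryInterface

	// InvalidateGroup drops only the cached resource lists for one API group
	// (e.g. "group01.example.com"), leaving every other group's cache intact.
	InvalidateGroup(group string)
}

// onCRDApplied shows how an applier could use it: instead of a RESTMapper reset
// and Invalidate() wiping the whole disk cache (the rm -rf equivalent described
// above), only the group of the CRD that just changed would be refetched.
func onCRDApplied(cache GroupInvalidator, crdGroup string) {
	cache.InvalidateGroup(crdGroup)
}
```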
Yeah, the upstream mapper is the root problem here. It doesn’t provide an API for invalidating a single API group, and it’s too aggressive about pre-loading because it’s designed to assume that a cache miss means it’s not registered on the server. There have been a number of proposals to fix this but none of them have been implemented yet.
On the easier side, we could expose the QPS config to the user. That feels like lazy UX, hoisting a solvable problem onto the user to configure. But maybe that’s preferable to not having a good solution in the short term. I know some Config Sync users that also want to be able to decrease the QPS to reduce server load without needing to configure QoS on every cluster.
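As a purely hypothetical illustration of the "expose it to the user" option, something like the following could let users tune the limits themselves. The flag names are invented for this sketch and are not existing kpt flags.

```go
// Hypothetical illustration only: these flag names are invented, not existing
// kpt flags. The point is just that the client-side limits could be user-tunable.
package sketch

import (
	"flag"

	"k8s.io/client-go/rest"
)

var (
	apiQPS   = flag.Float64("api-qps", 5, "client-side QPS to the apiserver (hypothetical)")
	apiBurst = flag.Int("api-burst", 10, "client-side burst to the apiserver (hypothetical)")
)

// applyUserLimits copies the user-provided limits onto the rest.Config that the
// applier and discovery client are built from.
func applyUserLimits(cfg *rest.Config) {
	cfg.QPS = float32(*apiQPS)
	cfg.Burst = *apiBurst
}
```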
I think this is a good idea. Client-side throttling is just not the right way for the apiserver to protect against overload, and we have Priority and Fairness in newer Kubernetes versions - I believe in all OSS-supported Kubernetes versions.
In kubebuilder-declarative-pattern we are experimenting with an alternative RESTMapper that doesn't fetch things we don't need, I'd also be in favor of reusing that: https://github.com/kubernetes-sigs/kubebuilder-declarative-pattern/tree/master/pkg/restmapper
As you point out, @jashandeep-sohi, it's hard to get this accepted upstream without demonstrating it first, so the strategy we seem to be converging on is copy-and-pasting the code between the various projects and letting them evolve independently (with occasional sync-ups), and then we can try to get this upstreamed into client-go or wherever it belongs best.
Thanks for the contribution @jashandeep-sohi :-) One problem is that it looks like you haven't signed the project's CLA ( https://github.com/GoogleContainerTools/kpt/pull/3368/checks?check_run_id=7372699029 ) - are you able to sign it?
Actually I don’t think we want to disable QPS here. If you move it down into UpdateQPS tho that’d be ok. Then it only disables on newer clusters with P&F enabled.
> we have Priority and Fairness in newer Kubernetes versions - I believe in all OSS-supported Kubernetes versions.
We still support v1.21, where P&F is alpha, and the GKE stable channel still uses it. That's why UpdateQPS does the check to see if it's enabled before disabling QPS client-side.
Oh, also, I’ve found the P&F docs to be inscrutable. Nobody knows how to configure it and there aren’t very many examples available. I’ve been meaning to interrogate the api machinery team and get some recommendations published but haven’t gotten around to it yet.
@justinsb
Right, forgot about the CLA. Should be signed now.
And thanks for the pointers about getting things merged upstream. I also took a quick crack at improving the cache invalidation in forks:
- https://github.com/kubernetes/apimachinery/compare/v0.24.0...jashandeep-sohi:apimachinery:discovery-cache
- https://github.com/kubernetes/client-go/compare/v0.24.0...jashandeep-sohi:client-go:discovery-cache
- https://github.com/kubernetes-sigs/cli-utils/compare/v0.31.2...jashandeep-sohi:cli-utils:discovery-cache
- https://github.com/jashandeep-sohi/kpt/releases/tag/v1.0.0-beta.17-discovery-cache
It's certainly faster in my not-so-real-world test:
$ kind create cluster --name kpt-test
$ for i in {01..99}; do cat <<EOF; done > crds.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: tests.group${i}.example.com
spec:
  group: group${i}.example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
  scope: Namespaced
  names:
    plural: tests
    singular: tests
    kind: Test
---
EOF
# this also has discovery throttling disabled in addition to the above patches
$ curl -L 'https://github.com/jashandeep-sohi/kpt/releases/download/v1.0.0-beta.17-discovery-cache/kpt_linux_amd64' -o kpt-dc
$ chmod +x kpt-dc
$ kpt-dc live init
$ kpt-dc live install-resource-group
$ time kpt-dc live apply
$ time kpt-dc live destroy
# counting the number of requests per API endpoint
$ kpt-dc live apply -v 6 2>&1 | grep -i 'http' | cut -d ' ' -f 6-9 | sort | uniq -c
$ kpt-dc live destroy -v 6 2>&1 | grep -i 'http' | cut -d ' ' -f 6-9 | sort | uniq -c
> Actually I don’t think we want to disable QPS here. If you move it down into UpdateQPS tho that’d be ok. Then it only disables on newer clusters with P&F enabled.
I tried to do that as well, but as far as I can tell https://pkg.go.dev/k8s.io/client-go/rest#Config doesn't expose anything related to discovery. It's being handled here: https://github.com/kubernetes/cli-runtime/blob/4fdf49ae46a0caa7fafdfe97825c6129d5153f06/pkg/genericclioptions/config_flags.go#L269-L289 (which is also where I presume the values set by the UpdateQPS callback are overwritten).
Ping @karlkfi @justinsb. Are we ready to merge this?
No. It's not safe to disable discoveryQPS and discoveryBurst unless we know server-side throttling is enabled. That's what the code in UpdateQPS is doing.
Unfortunately, it means the lookup logic needs to be pulled out and done earlier, so it can influence QPS in both use cases, because ConfigFlags.toDiscoveryClient() overrides the values in QPS and Burst with what's in discoveryQPS and discoveryBurst.
Looks like this is now causing test failures in TestFnRender and TestFnEval.
Lots of logs like:
W0805 22:39:22.871596 75576 factory.go:71] Failed to query apiserver to check for flow control enablement: making /livez/ping request: dial tcp [::1]:8080: connect: connection refused
It's possible this is now trying to query the server before the connection is configured. I guess because WrapConfigFn used to delay the call until later in the process.
I wonder if WrapConfigFn could be injected with a pointer to the ConfigFlags and modify them when run, instead of just modifying the rest.Config. Would that work, or is that too late to modify the flags struct?
Ok, wrapped everything in a closure and that fixed the tests. And I don't think it's too late, because WrapConfigFn is called (in f.ToRESTConfig) before Burst and QPS are set.
https://github.com/kubernetes/cli-runtime/blob/4fdf49ae46a0caa7fafdfe97825c6129d5153f06/pkg/genericclioptions/config_flags.go#L269-L289
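For reference, a simplified sketch of the closure approach described above: the flow-control probe runs inside WrapConfigFn, so it only fires once the rest.Config is actually built (when f.ToRESTConfig() is called), rather than at flag-construction time. The flowControlEnabled helper is the illustrative probe sketched earlier in this thread, not a real library function.

```go
// Simplified sketch of the closure fix described above: defer the flow-control
// probe until WrapConfigFn runs, instead of querying the apiserver while the
// flags are still being constructed.
package sketch

import (
	"context"

	"k8s.io/cli-runtime/pkg/genericclioptions"
	"k8s.io/client-go/rest"
)

// flowControlEnabled is the illustrative APF probe sketched earlier in this
// thread (the /livez/ping response-header check); stubbed here for brevity.
func flowControlEnabled(ctx context.Context, cfg *rest.Config) (bool, error) {
	// ... see the earlier sketch ...
	return false, nil
}

func newConfigFlags() *genericclioptions.ConfigFlags {
	flags := genericclioptions.NewConfigFlags(true)
	flags.WrapConfigFn = func(cfg *rest.Config) *rest.Config {
		enabled, err := flowControlEnabled(context.Background(), cfg)
		if err != nil {
			// Server not reachable yet (e.g. during unit tests): keep the defaults.
			return cfg
		}
		if enabled {
			cfg.QPS = -1  // server-side APF is on; turn off client-side rate limiting
			cfg.Burst = -1
		}
		return cfg
	}
	return flags
}
```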
We'll probably want to update kpt-config-sync too. And we JUST migrated the source of truth to GitHub yesterday for that, so we can accept PRs now. Is that something you're interested in doing, now that we know the change to make? I think the code is very similar there, if it's not already using this function (which would mean needing a kpt dep bump after the next kpt release).
Looks like CS has its own copy: https://github.com/GoogleContainerTools/kpt-config-sync/blob/main/pkg/client/restconfig/restconfig.go#L64