
Support discovery cache or up both QPS/BurstQPS

Open chobostar opened this issue 2 years ago • 1 comments

Dear maintainers, thanks for your awesome tool! I would like to improve the overall tkn experience and bring a case to your attention.

We use crossplane in our cluster, so we end up with tons of CRDs:

$ kubectl get crds | wc -l
477

This makes tkn actions extremely slow:

$ time tkn task start --showlog hello 
I0711 10:15:46.560753 2690678 request.go:665] Waited for 1.196588288s due to client-side throttling, not priority and fairness, request: GET:https://api-server.example.com:6443/apis/azure.crossplane.io/v1alpha3
I0711 10:15:56.561403 2690678 request.go:665] Waited for 11.196579935s due to client-side throttling, not priority and fairness, request: GET:https://api-server.example.com:6443/apis/cache.aws.crossplane.io/v1alpha1
I0711 10:16:06.761036 2690678 request.go:665] Waited for 21.395650905s due to client-side throttling, not priority and fairness, request: GET:https://api-server.example.com:6443/apis/eks.aws.crossplane.io/v1alpha1
Error: Task name hello does not exist in namespace default

real	0m31,985s
user	0m0,382s
sys	0m0,070s

Here is a similar discussion: https://github.com/kubernetes/kubectl/issues/1126. Here is the solution for kubectl: https://github.com/kubernetes/kubernetes/pull/105520

From my debugging, tkn doesn't use the discovery cache:

$ strace -f -e trace=file tkn task start --showlog hello 2>&1 | grep 'kube/cache' | wc -l
0

For comparison, this is how kubectl uses the cache:

$ strace -f -e trace=file kubectl get pods 2>&1 | grep '.kube/cache' | head
[pid 2691748] openat(AT_FDCWD, "/home/user/.kube/cache/discovery/api_server.example.com/servergroups.json", O_RDONLY|O_CLOEXEC) = 6
[pid 2691748] openat(AT_FDCWD, "/home/user/.kube/cache/discovery/api_server.example.com/metrics.k8s.io/v1beta1/serverresources.json", O_RDONLY|O_CLOEXEC) = 6
[pid 2691746] openat(AT_FDCWD, "/home/user/.kube/cache/discovery/api_server.example.com/notificationchannel.example.com/v1alpha1/serverresources.json", O_RDONLY|O_CLOEXEC) = 7
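To illustrate why an on-disk cache helps, here is a minimal sketch of a file-backed cache with a time-to-live. It is not how kubectl or client-go implement their discovery cache; `cachedFetch`, `demo`, the 10-minute TTL, and the file layout are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cachedFetch returns the contents of path if the file is younger than ttl.
// Otherwise it calls fetch, stores the result at path, and returns it.
// (Hypothetical helper, not a client-go API.)
func cachedFetch(path string, ttl time.Duration, fetch func() ([]byte, error)) ([]byte, error) {
	if info, err := os.Stat(path); err == nil && time.Since(info.ModTime()) < ttl {
		return os.ReadFile(path)
	}
	data, err := fetch()
	if err != nil {
		return nil, err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return nil, err
	}
	return data, os.WriteFile(path, data, 0o644)
}

// demo performs n lookups of the same resource through the cache and
// reports how many of them actually reached the (simulated) API server.
func demo(n int) (int, error) {
	dir, err := os.MkdirTemp("", "discovery-cache")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	// Layout loosely modelled on the paths in the strace output above.
	path := filepath.Join(dir, "example.com", "v1", "serverresources.json")

	calls := 0
	fetch := func() ([]byte, error) {
		calls++ // stands in for one GET /apis/<group>/<version>
		return []byte(`{"resources":[]}`), nil
	}
	for i := 0; i < n; i++ {
		if _, err := cachedFetch(path, 10*time.Minute, fetch); err != nil {
			return calls, err
		}
	}
	return calls, nil
}

func main() {
	calls, err := demo(3)
	if err != nil {
		panic(err)
	}
	fmt.Println("network fetches:", calls)
}
```

Only the first lookup calls `fetch`; later lookups within the TTL are served from disk, so they never reach the client-side rate limiter.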

Second, as I see in https://github.com/tektoncd/cli/issues/1506, tkn supports QPS=5. Unfortunately, that is not enough for clusters with hundreds or thousands of CRDs, and users still end up with slow actions.
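A rough back-of-the-envelope model shows why QPS=5 hurts here. The sketch below assumes a plain token-bucket limiter, which is the general shape of client-side throttling. The burst of 10 is the client-go default and is assumed, not confirmed, for tkn. The count of 165 discovery requests and the higher limits of 50 and 300 are illustrative values, not measurements from this cluster.

```go
package main

import "fmt"

// estimateSeconds returns roughly how long a token-bucket limiter delays the
// last of n back-to-back requests: the first `burst` requests go out
// immediately, and the rest are released at `qps` per second.
func estimateSeconds(n int, qps float64, burst int) float64 {
	if n <= burst {
		return 0
	}
	return float64(n-burst) / qps
}

func main() {
	// Hypothetical number of discovery requests (one per API group/version).
	const groupVersions = 165

	fmt.Printf("QPS=5,  burst=10:  %.1fs\n", estimateSeconds(groupVersions, 5, 10))
	fmt.Printf("QPS=50, burst=300: %.1fs\n", estimateSeconds(groupVersions, 50, 300))
}
```

Under these assumptions the first configuration waits about 31 seconds, which is in the same range as the timing above, while the second fits entirely inside the burst and does not wait at all.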

Thanks for reading!

Feature request

  1. Ability to use local cache
  2. Raise QPS and support BurstQPS during discovery.

Use case

Speed up tkn actions.

UI Example

N/A

chobostar avatar Jul 11 '22 04:07 chobostar

/assign @piyush-garg

pradeepitm12 avatar Aug 16 '22 12:08 pradeepitm12

You might get this for free if you update to the latest client-go, which includes https://github.com/kubernetes/kubernetes/pull/109141

negz avatar Aug 17 '22 03:08 negz

Fixed by #1693

piyush-garg avatar Sep 02 '22 09:09 piyush-garg

/close

piyush-garg avatar Sep 02 '22 09:09 piyush-garg

@piyush-garg: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot avatar Sep 02 '22 09:09 tekton-robot

@chobostar we have increased QPS and Burst to match kubectl. Is it working fine now?

piyush-garg avatar Sep 06 '22 11:09 piyush-garg

@piyush-garg

Before:

$ tkn version
Client version: 0.24.0
...

$ time tkn task start --showlog hello 
I0906 19:28:32.642669  382538 request.go:665] Waited for 1.196621458s due to client-side throttling, not priority and fairness, request: GET:https://api.example.com:6443/apis/agent.open-cluster-management.io/v1
I0906 19:28:42.642881  382538 request.go:665] Waited for 11.195896779s due to client-side throttling, not priority and fairness, request: GET:https://api.example.com:6443/apis/database.aws.crossplane.io/v1beta1
I0906 19:28:52.642909  382538 request.go:665] Waited for 21.194934285s due to client-side throttling, not priority and fairness, request: GET:https://api.example.com:6443/apis/sqs.aws.crossplane.io/v1beta1
Error: Task name hello does not exist in namespace default

real	0m31,971s
user	0m0,432s
sys	0m0,179s

After:

$ ./tkn version
Client version: 0.26.0
...

$ time ./tkn task start --showlog hello
Error: Task name hello does not exist in namespace default

real	0m1,763s
user	0m0,254s
sys	0m0,086s

Thanks guys! Now it's much much better!

chobostar avatar Sep 06 '22 13:09 chobostar