kubectl icon indicating copy to clipboard operation
kubectl copied to clipboard

kubectl get doesn't stream results with many results

Open apelisse opened this issue 4 years ago • 18 comments

I've got a cluster here with many many objects of a specific resource type (probably too many), and I'm trying to list them anyway:

$ kubectl get myresourcetype
<hangs for ever, or not but who knows>

If I run the same request with v8, I can see that it's running a lot of requests to the apiserver, with the continue token etc. But it looks like none of these resources are going to be printed before they are all collected.

There is a chunk-size parameter, but that only guides the number of objects returned by each request, not how things are going to be displayed, there's also no limit parameter.

That seems wrong to me.

/assign @soltysh @seans3 cc @lavalamp

apelisse avatar May 04 '20 17:05 apelisse

/sig cli /area kubectl /kind bug

seans3 avatar May 27 '20 16:05 seans3

I would maybe consider this a feature more than a bug, but it's debatable :-)

apelisse avatar May 27 '20 16:05 apelisse

I was able to replicate this by creating 100k configmaps. I'll try poking around and see what I can figure out.

eddiezane avatar May 29 '20 17:05 eddiezane

I've tracked this down to being a bottleneck here:

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L498

Specifically here:

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/cli-runtime/pkg/resource/result.go#L122

Issue is in fact the chunk-size. By default it's 500 which means that this loop makes 200 requests to get my 100k configmaps.

Passing --chunk-size 100000 makes the command run nearly instantly because the API server can handle the request.

./_output/bin/kubectl get configmaps  4.98s user 0.98s system 14% cpu 41.915 total

vs

./_output/bin/kubectl get configmaps --chunk-size 100000  3.62s user 0.56s system 78% cpu 5.307 total

Where to go from here I'm unsure. I guess we'd need to refactor the Infos()/Visit() calls to pass results back in a channel. Might not be worth the effort?

Thoughts @apelisse?

cc @soltysh @seans3 @brianpursley

eddiezane avatar Jun 03 '20 20:06 eddiezane

I suspect there are two things that are going to make the non-caching behavior happen:

  1. If we have to sort the results, we need to receive them all fist. is there a non-sorting option? Can you look at the options here, maybe we might have an option that would say "don't sort just give me the results as fast as you can". 1
  2. I believe we do something relative to printing the header for the columns based on the content, so we need to know the length of names before we can print the columns. I would look at what is done for multi-gvk 2

And I may be missing other reasons.

Hope that helps, happy to answer further questions

apelisse avatar Jun 03 '20 20:06 apelisse

Hmmm

  1. As of now when sorting is enabled we set chunk-size = 0 to fetch in one request.

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L464-L468

  1. I'm not too sure on this one but Infos() is already called to fetch the requests before we reach the multi-gvk stuff.

eddiezane avatar Jun 03 '20 21:06 eddiezane

I'm not too sure on this one but Infos() is already called to fetch the requests before we reach the multi-gvk stuff.

Yeah, keep digging there, you might find something useful.

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L464-L468

Isn't it sorting by default? I'm a little confused. I should take a minute to look at the sorting option for kubectl get.

apelisse avatar Jun 03 '20 21:06 apelisse

Isn't it sorting by default? I'm a little confused. I should take a minute to look at the sorting option for kubectl get.

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/kubectl/pkg/cmd/get/get.go#L210-L214

Unless provided sort-by is an empty string.

https://github.com/kubernetes/kubernetes/blob/e422e9a3f41bfdf27c4bd2ebfabff40fe7a8b1e9/staging/src/k8s.io/kubectl/pkg/cmd/get/humanreadable_flags.go#L124

eddiezane avatar Jun 03 '20 22:06 eddiezane

Sure, can you dig deeper and make sure that they are not eventually sorted anyway on a simple get? If there is no sorting involved, that would probably make things much simpler.

And also, I would suggest that you keep digging on this multi-gvk thing.

Thanks @eddiezane!

apelisse avatar Jun 04 '20 16:06 apelisse

/priority P2

eddiezane avatar Jun 24 '20 16:06 eddiezane

@eddiezane @apelisse I think this is a throttling issue. Take a look at this PR and let me know what you think: https://github.com/kubernetes/kubernetes/pull/92483

EDIT: I guess this doesn't fix the original issue, but fixes a related problem of poor performance with chunking large lists. The original issue of not being able to see partial results before all chunks are received remains.

brianpursley avatar Jun 24 '20 19:06 brianpursley

I looked into this a little more and discovered that in order to stream results as they are received, kubectl get needs to be changed to use Visit() instead of Infos()

https://github.com/kubernetes/kubernetes/blob/6d7a9048b67d81e57923892650a23636f1afba42/staging/src/k8s.io/cli-runtime/pkg/resource/result.go#L91-L101

Visit() would allow row-by-row printing, whereas Infos() returns the entire result after it is received.

It isn't too straightforward though because Visit would not work in the case of sorting, so it would still need to keep the current ability to get all results up-front.

brianpursley avatar Jun 24 '20 22:06 brianpursley

This (sorting) was one of the reasons we changed to Infos(). In addition, we've added pre-processing and post-processing hooks in apply. The sorting/ordering of resources as well as filtering can happen in the pre-processing hook. The post-processing hook is now where the prune functionality lives.

seans3 avatar Jun 24 '20 22:06 seans3

Agree about the post/pre processing in apply, but I'm hoping this is unrelated because it's get. I agree that sorting will make it difficult, but in that case, would be nice to have an option to disable sorting altogether in order to stream the results

apelisse avatar Jun 24 '20 22:06 apelisse

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot avatar Sep 22 '20 22:09 fejta-bot

/remove-lifecycle stale

seans3 avatar Sep 23 '20 00:09 seans3

/lifecycle frozen

seans3 avatar Sep 23 '20 00:09 seans3

cc @KnVerey

eddiezane avatar Mar 10 '21 17:03 eddiezane