Support pagination for watcher

Open xdatcloud opened this issue 3 years ago • 3 comments

Would you like to work on this feature?

maybe

What problem are you trying to solve?

Hi all, first thanks to kube-rs! It helps a lot with my work on Kubernetes with Rust.

I noticed that the Reflector from client-go uses a Pager to retrieve large result sets in chunks when listing resources: https://github.com/kubernetes/client-go/blob/0bc005e72ff13ab4ceffd5c4e0ecb1774a7bf7f8/tools/cache/reflector.go#L274-L278

Is it possible to introduce pagination into kube-rs to reduce the impact on the Kubernetes API server when fetching large result sets?

Thanks for any suggestions!

Describe the solution you'd like

The official Kubernetes documentation describes how to retrieve large result sets in chunks.

The Watcher from kube-runtime should set the limit list parameter on the first list request and pass the continue_token on later list requests. There are also some cases that need to be handled, e.g. HTTP 410 Gone.
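
A rough sketch of that list loop, to make the request concrete. Everything here is hypothetical and not kube-rs API: `ListSource`, `Page`, `ListError`, `list_all`, and the in-memory `FakeServer` that stands in for the API server. Restarting the list from the beginning on 410 Gone is only one possible reaction, and `max_restarts` is an assumed safety limit.

```rust
/// One page of a chunked list response (hypothetical shape, for illustration).
#[derive(Debug, Clone)]
pub struct Page<K> {
    pub items: Vec<K>,
    /// `None` or an empty string means this was the last page.
    pub continue_token: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ListError {
    /// HTTP 410 Gone: the continue token has expired.
    Gone,
    Other(String),
}

/// Hypothetical abstraction over "send one list request".
pub trait ListSource<K> {
    fn list(&mut self, limit: usize, continue_token: Option<&str>) -> Result<Page<K>, ListError>;
}

/// Fetch every page; on 410 Gone, throw away partial results and start over.
pub fn list_all<K, S: ListSource<K>>(
    source: &mut S,
    limit: usize,
    max_restarts: usize,
) -> Result<Vec<K>, ListError> {
    let mut restarts = 0;
    'restart: loop {
        let mut items = Vec::new();
        let mut token: Option<String> = None;
        loop {
            match source.list(limit, token.as_deref()) {
                Ok(page) => {
                    items.extend(page.items);
                    match page.continue_token {
                        Some(t) if !t.is_empty() => token = Some(t),
                        _ => return Ok(items), // no continue value: last page
                    }
                }
                Err(ListError::Gone) if restarts < max_restarts => {
                    restarts += 1;
                    continue 'restart;
                }
                Err(e) => return Err(e),
            }
        }
    }
}

/// In-memory stand-in for the API server; the token is just the next offset.
pub struct FakeServer {
    pub objects: Vec<u32>,
    /// How many upcoming continued requests should fail with 410 Gone.
    pub expire_next: usize,
    pub requests: usize,
}

impl ListSource<u32> for FakeServer {
    fn list(&mut self, limit: usize, continue_token: Option<&str>) -> Result<Page<u32>, ListError> {
        self.requests += 1;
        let start = match continue_token {
            None => 0,
            Some(t) => {
                if self.expire_next > 0 {
                    self.expire_next -= 1;
                    return Err(ListError::Gone);
                }
                t.parse::<usize>().map_err(|e| ListError::Other(e.to_string()))?
            }
        };
        let end = (start + limit.max(1)).min(self.objects.len());
        let continue_token = if end < self.objects.len() { Some(end.to_string()) } else { None };
        Ok(Page { items: self.objects[start..end].to_vec(), continue_token })
    }
}
```

With seven objects and a limit of 3, `list_all` issues three requests; if one continued request returns 410 Gone, it re-lists from the start and ends up issuing five.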

Describe alternatives you've considered

Follow the best practice on Kubernetes Guides.

Documentation, Adoption, Migration Strategy

No response

Target crate for feature

kube-runtime

xdatcloud avatar May 19 '22 08:05 xdatcloud

Hi!

While technically this shouldn't be too hard to implement into the watcher, it would break some assumptions that we make downstream. We basically have two options for implementing this:

  1. Hide this inside the watcher and buffer up the Event::Restarted event
    • I'm not sure this helps a huge amount; all this really does is move the buffering from the API server to the client
    • However, it might help us stream the JSON parsing/deserialization, and load on the client might be preferable to load on the server
  2. Emit separate events for each chunk
    • This would allow streaming output to the user (for clients that care about this), but is a pretty massively breaking change for existing users, and introduces a lot of downstream complexity
    • Depending on how we implement this, reflector would either have to add its own internal buffering, or stop pruning deleted objects on resyncs

We might also be able to do a hybrid approach, where we introduce a new chunked_watcher, and reimplement watcher as a buffering layer on top of that...
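
To illustrate the hybrid idea, here is a minimal sketch of a buffering layer over a chunked event source. It uses plain iterators instead of async streams to stay self-contained, and `ChunkedEvent`, its `last` flag, and `buffered` are made-up names; `Event` is only a simplified stand-in for the watcher's event type.

```rust
/// Hypothetical output of a `chunked_watcher`: restarts arrive page by page.
#[derive(Debug, PartialEq)]
pub enum ChunkedEvent<K> {
    Applied(K),
    Deleted(K),
    /// One page of a relist; `last` marks the final page (an assumption).
    RestartChunk { items: Vec<K>, last: bool },
}

/// Simplified stand-in for the existing watcher event type.
#[derive(Debug, PartialEq)]
pub enum Event<K> {
    Applied(K),
    Deleted(K),
    Restarted(Vec<K>),
}

/// Buffering layer: collects all chunks of a restart and emits a single
/// `Event::Restarted` once the final chunk has arrived.
pub fn buffered<K>(
    chunked: impl Iterator<Item = ChunkedEvent<K>>,
) -> impl Iterator<Item = Event<K>> {
    let mut pending: Vec<K> = Vec::new();
    chunked.filter_map(move |ev| match ev {
        ChunkedEvent::Applied(k) => Some(Event::Applied(k)),
        ChunkedEvent::Deleted(k) => Some(Event::Deleted(k)),
        ChunkedEvent::RestartChunk { items, last } => {
            pending.extend(items);
            if last {
                // Hand over the whole buffer and leave an empty one behind.
                Some(Event::Restarted(std::mem::take(&mut pending)))
            } else {
                None
            }
        }
    })
}
```

Existing users would keep seeing one `Restarted` per relist, while clients that care about streaming could consume the chunked source directly.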

nightkr avatar May 19 '22 10:05 nightkr

Based on the alternatives, I'm wondering if emitting an Event::RestartChunk might make sense here. The Event API is not super useful outside kube_runtime except for niche use cases such as writing custom reflectors / timed stores.

I feel that emitting pages could allow faster consumption of items from the apiserver as they appear (at least from watcher's POV), and if backpressure works properly for us (which it might, but not sure), it could also limit how quickly the list pagination happens - lessening the load on both us and the apiserver in cases where we exit early.
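
A small sketch of what pull-driven pagination could look like, under the assumption that pages are only requested when the consumer asks for the next one. `LazyPager` and `lazy_pages` are hypothetical names, and the `fetch` closure stands in for a list request that takes the previous continue token and returns the items plus the next token.

```rust
/// Pull-based pager: a page is only fetched when the consumer asks for it.
pub struct LazyPager<F> {
    fetch: F,
    next_token: Option<String>,
    done: bool,
}

/// `fetch` receives the previous continue token (None for the first page)
/// and returns the page items together with the next continue token.
pub fn lazy_pages<K, F>(fetch: F) -> LazyPager<F>
where
    F: FnMut(Option<&str>) -> (Vec<K>, Option<String>),
{
    LazyPager { fetch, next_token: None, done: false }
}

impl<K, F> Iterator for LazyPager<F>
where
    F: FnMut(Option<&str>) -> (Vec<K>, Option<String>),
{
    type Item = Vec<K>;

    fn next(&mut self) -> Option<Vec<K>> {
        if self.done {
            return None;
        }
        let (items, token) = (self.fetch)(self.next_token.as_deref());
        match token {
            Some(t) if !t.is_empty() => self.next_token = Some(t),
            _ => self.done = true, // empty continue value: no more pages
        }
        Some(items)
    }
}
```

If the consumer drops the pager after two pages, only two list requests are ever made, which is the early-exit saving described above.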

It is possible that we can make reflector stores do smart internal buffering such as:

  1. Event::RestartChunk(Vec<K>) appears from watcher
  2. Store immediately snapshots current ObjectRefs
  3. Store replaces elements in chunk with RestartChunk(Vec<K>) and clears corresponding entries from snapshotted objectrefs
  4. goto 1 until completed (not sure how we know if the chunk is the last chunk?)
  5. At the end of restarted, delete uncleared objectrefs from store

Provided we are able to learn when we receive the last page, it shouldn't use a whole lot more memory, since we can just overwrite and keep a small list of things we have seen, and diff that at the end with what we didn't see (which must have been removed).
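
The steps above could be sketched roughly like this. `Store`, `apply_restart_chunk`, and the `last` flag are illustrative only and not the real reflector store; `R` plays the role of an ObjectRef and `K` the object. The sketch assumes the snapshot is taken once, on the first chunk of a restart, rather than on every chunk.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Toy store keyed by an object reference `R` (stand-in for ObjectRef).
pub struct Store<R, K> {
    objects: HashMap<R, K>,
    /// Refs that existed when the restart began and were not seen in any
    /// chunk so far. `None` means no restart is in progress.
    unseen: Option<HashSet<R>>,
}

impl<R: Eq + Hash + Clone, K> Store<R, K> {
    pub fn new() -> Self {
        Store { objects: HashMap::new(), unseen: None }
    }

    /// Apply one chunk of a restart. `last` is an assumed signal from the
    /// watcher that this is the final page.
    pub fn apply_restart_chunk(&mut self, chunk: Vec<(R, K)>, last: bool) {
        // Snapshot the current refs on the first chunk of a restart only.
        let mut unseen = self
            .unseen
            .take()
            .unwrap_or_else(|| self.objects.keys().cloned().collect());

        // Overwrite objects from the chunk and mark their refs as seen.
        for (r, obj) in chunk {
            unseen.remove(&r);
            self.objects.insert(r, obj);
        }

        if last {
            // Whatever was never seen must have been deleted upstream.
            for r in &unseen {
                self.objects.remove(r);
            }
        } else {
            self.unseen = Some(unseen);
        }
    }

    pub fn get(&self, r: &R) -> Option<&K> {
        self.objects.get(r)
    }

    pub fn len(&self) -> usize {
        self.objects.len()
    }
}
```

The extra memory is one set of refs per in-flight restart, and readers keep seeing the old objects until the final chunk prunes them.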

clux avatar May 23 '22 21:05 clux

I'm also interested in this feature, so I could try submitting a PR if we have a consensus on the design.

But I'd start with the watcher only, not the reflector, as I don't have any experience with it, and it would also make the scope of the PR much smaller.

How does that sound?

Edit: the chunked response from API server contains a continue value that is empty when it's the last page.

goenning avatar Aug 09 '22 04:08 goenning