Support pagination for watcher
Would you like to work on this feature?
maybe
What problem are you trying to solve?
Hi all, first thanks for kube-rs! It helps a lot with my work on Kubernetes in Rust.
I noticed the Reflector from client-go uses a Pager to retrieve large result sets in chunks when listing resources: https://github.com/kubernetes/client-go/blob/0bc005e72ff13ab4ceffd5c4e0ecb1774a7bf7f8/tools/cache/reflector.go#L274-L278
Would it be possible to introduce pagination into kube-rs to reduce the impact on the Kubernetes API server when fetching large result sets?
Thanks for any suggestions!
Describe the solution you'd like
The official Kubernetes documentation describes how to retrieve large result sets in chunks.
The watcher from kube-runtime should set the `limit` list parameter on the first list request and handle the `continue` token on subsequent list requests; there are also some cases that need to be handled, e.g. HTTP 410 Gone (an expired continue token).
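For reference, a minimal sketch of what the underlying paginated list calls look like using kube's existing `Api::list` and `ListParams` (exact field/builder names depend on the kube version; the resource type, namespace, and page size here are just placeholders):

```rust
use k8s_openapi::api::core::v1::Pod;
use kube::{
    api::{Api, ListParams},
    Client,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::try_default().await?;
    let pods: Api<Pod> = Api::default_namespaced(client);

    // Ask for pages of at most 500 objects and follow the continue token
    // until the apiserver stops returning one (empty/absent on the last page).
    let mut continue_token: Option<String> = None;
    loop {
        let mut lp = ListParams::default().limit(500);
        lp.continue_token = continue_token.clone();

        // NOTE: if the token has expired, the apiserver answers 410 Gone and
        // the whole list has to be restarted from the beginning.
        let page = pods.list(&lp).await?;
        for pod in &page.items {
            println!("{}", pod.metadata.name.as_deref().unwrap_or_default());
        }

        continue_token = page.metadata.continue_.filter(|t| !t.is_empty());
        if continue_token.is_none() {
            break;
        }
    }
    Ok(())
}
```

The watcher would essentially run this loop internally before switching to the watch phase.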
Describe alternatives you've considered
Follow the best practices from the Kubernetes guides.
Documentation, Adoption, Migration Strategy
No response
Target crate for feature
kube-runtime
Hi!
While technically this shouldn't be too hard to implement in the watcher, it would break some assumptions that we make downstream. We basically have two options for implementing this:
- Hide this inside the watcher and buffer up the `Event::Restarted` event
  - I'm not sure this helps a huge amount, all this really does is move the buffering from the API server to the client
  - However, it might help us stream the JSON parsing/deserialization, and load on the client might be preferable to load on the server
- Emit separate events for each chunk
  - This would allow streaming output to the user (for clients that care about this), but is a pretty massively breaking change for existing users, and introduces a lot of downstream complexity
  - Depending on how we implement this, reflector would either have to add its own internal buffering, or stop pruning deleted objects on resyncs
We might also be able to do a hybrid approach, where we introduce a new `chunked_watcher`, and reimplement `watcher` as a buffering layer on top of that...
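For illustration, the events such a `chunked_watcher` could emit might look something like the following; the `RestartChunk`/`RestartDone` variants are purely hypothetical names for this discussion, while the other variants mirror the current `watcher::Event`:

```rust
/// A hypothetical event type for a paginated watcher. Only `RestartChunk`
/// and `RestartDone` are new; the rest mirrors today's kube_runtime
/// `watcher::Event`.
pub enum Event<K> {
    /// An object was added or modified.
    Applied(K),
    /// An object was deleted.
    Deleted(K),
    /// One page of a paginated relist.
    RestartChunk(Vec<K>),
    /// The relist finished (the apiserver returned an empty continue token).
    RestartDone,
    /// The existing all-at-once relist event, kept here for comparison.
    Restarted(Vec<K>),
}
```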
Based on the alternatives, I'm wondering if emitting an `Event::RestartChunk` might make sense here. The `Event` API is not super useful outside kube_runtime except for niche use cases like writing custom reflectors / timed stores.
I feel that emitting pages could allow faster consumption of items from the apiserver as they appear (at least from the watcher's POV), and if backpressure works properly for us (which it might, but I'm not sure), it could also limit how quickly the list pagination happens, lessening the load on both us and the apiserver in cases where we exit early.
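For consumers that still want today's behaviour, a thin adapter over those hypothetical chunk events could rebuild a single `Restarted` event, which is roughly what the buffering-layer idea above would amount to (sketch only, reusing the `Event` shape from the previous snippet):

```rust
use futures::{future, Stream, StreamExt};

/// Folds the hypothetical RestartChunk/RestartDone events back into a single
/// Restarted event, so existing consumers keep seeing the current semantics.
fn buffer_chunks<K, E>(
    events: impl Stream<Item = Result<Event<K>, E>>,
) -> impl Stream<Item = Result<Event<K>, E>> {
    let mut buf: Vec<K> = Vec::new();
    events.filter_map(move |ev| {
        future::ready(match ev {
            // Accumulate pages until the relist is finished.
            Ok(Event::RestartChunk(mut page)) => {
                buf.append(&mut page);
                None
            }
            // Emit one big Restarted event once the last page has arrived.
            Ok(Event::RestartDone) => Some(Ok(Event::Restarted(std::mem::take(&mut buf)))),
            // Pass everything else (Applied/Deleted/errors) straight through.
            other => Some(other),
        })
    })
}
```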
It is possible that we can make reflector stores do smart internal buffering such as:
1. `Event::RestartChunk(Vec<K>)` appears from watcher
2. Store immediately snapshots its current `ObjectRef`s
3. Store replaces its elements with the chunk's contents and clears the corresponding entries from the snapshotted objectrefs
4. goto 1 until completed (not sure how we know if the chunk is the last chunk?)
5. At the end of the restart, delete uncleared objectrefs from the store
Provided we are able to learn when we receive the last page, it shouldn't use a whole lot more memory, since we can just overwrite and keep a small list of things we have seen, then diff that at the end against what we didn't see (which must have been removed).
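A minimal sketch of that store-side buffering, assuming the hypothetical chunk events above (the `ObjectRef` alias and `ChunkedStore` type here are stand-ins for illustration, not the real reflector types):

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for kube_runtime::reflector::ObjectRef; a real store keys by
/// name/namespace/etc., this sketch just uses a plain string key.
type ObjectRef = String;

struct ChunkedStore<K> {
    store: HashMap<ObjectRef, K>,
    /// Keys that were present before the relist and have not reappeared yet.
    pending_deletion: Option<HashSet<ObjectRef>>,
}

impl<K> ChunkedStore<K> {
    /// Apply one RestartChunk worth of objects.
    fn apply_chunk(&mut self, chunk: Vec<(ObjectRef, K)>) {
        // On the first chunk of a relist, snapshot every key currently held.
        if self.pending_deletion.is_none() {
            self.pending_deletion = Some(self.store.keys().cloned().collect());
        }
        let pending = self.pending_deletion.as_mut().expect("just initialised");
        for (key, obj) in chunk {
            pending.remove(&key); // still present on the apiserver, keep it
            self.store.insert(key, obj); // overwrite with the fresh version
        }
    }

    /// Called once the last page has arrived (empty continue token).
    fn finish_restart(&mut self) {
        // Anything we never saw during the relist must have been deleted.
        if let Some(stale) = self.pending_deletion.take() {
            for key in stale {
                self.store.remove(&key);
            }
        }
    }
}
```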
I'm also interested in this feature, so I could try submitting a PR if we have a consensus on the design.
But I'd start with the watcher only, not the reflector, since I don't have any experience with the latter, and it keeps the scope of the PR much smaller.
How does that sound?
Edit: the chunked response from the API server contains a `continue` value that is empty when it's the last page.