Confusion about the "resync" of informers
This is a request to reopen the existing issue for further discussion. The issue pertains to the need for periodic resync in the work process of informers.
The work process of informers, as I understand it:
- First, list all the resources according to the given options, then initialize the indexer (local cache).
- A watch loop watches for ADD, UPDATE, and DELETE events and puts each (obj, event) pair into the DeltaFIFO.
- Pop (obj, event) from the DeltaFIFO, update the indexer (local cache) according to the event, and distribute the event to the listeners interested in these resources.
But I also see the delta type Sync: during a sync (i.e. the resync method), the DeltaFIFO takes all objects from the indexer (local cache), re-enqueues them into the DeltaFIFO (if the object is not currently in the FIFO), and then triggers an update event to the listeners. If there is a risk that the client will lose some events, why not just sync them from the API server? Since the only data source of the indexer (local cache) is the DeltaFIFO, what do we gain from the periodic resync method?
This becomes even more interesting when we examine a typical informer's handlers, where we compare events by their resource version. As a result, resync may become redundant (e.g. in Traefik).
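For reference, here is a minimal sketch of the handler pattern I mean (the Pod type and the `onChange` callback are just illustrative):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/cache"
)

// filteringHandler drops the no-op updates produced by a periodic resync:
// a resynced object carries the same ResourceVersion as the cached copy,
// so UpdateFunc sees old == new and returns early.
func filteringHandler(onChange func(*corev1.Pod)) cache.ResourceEventHandlerFuncs {
	return cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(oldObj, newObj interface{}) {
			oldPod := oldObj.(*corev1.Pod)
			newPod := newObj.(*corev1.Pod)
			if oldPod.ResourceVersion == newPod.ResourceVersion {
				// Periodic resync: nothing actually changed; skip it.
				return
			}
			onChange(newPod)
		},
	}
}
```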
Hi @stillya,
From your issue I see that the main questions you're trying to answer are (correct me if I'm wrong):
- Why not just sync events from the API server?
- What are the benefits of the periodic `resync`ing?
- `resync` may become redundant* (not a question, but it's still here)
All these questions come down to the conceptual difference between caching and direct access:

- Why not just sync events from the API server?
  While it's possible to obtain the events via the Kube API (in fact, from `etcd`), there are a number of reasons to avoid this:
  - the cache contains already deserialized and decompressed data
  - requesting from `etcd` might be very time-consuming, especially in large clusters or when dealing with high event frequency
  - this increases the Kube API load, which may affect the overall health of the cluster
- What are the benefits of the periodic `resync`ing?
  - the cache is a redundancy* layer: if the API is unreachable or there are other network issues, the cache ensures you can still operate on the cached data
  - the cache reduces the load on the API server, allowing you to avoid costly operations (see the sketch below)
* The point is that redundancy, in the case of caching, is a good thing: it effectively lets you avoid data loss and repeated requests, and improves the overall performance.
Let me rephrase Brown's quote: «It's better to be prepared for an issue and not have one, than to have an issue and not be prepared.»
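To make the caching point concrete, here is a minimal sketch of serving reads from the informer's local cache via a lister instead of hitting the API server on every read (the namespace and the 30-second resync period are arbitrary choices for illustration):

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// listPodsFromCache serves reads from the informer's local cache instead of
// issuing a LIST against the API server on every call.
func listPodsFromCache(clientset kubernetes.Interface, stopCh <-chan struct{}) error {
	// The second argument is the resync period under discussion: every 30s
	// the cached objects are replayed through the DeltaFIFO as Sync deltas.
	factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
	podInformer := factory.Core().V1().Pods()
	lister := podInformer.Lister()

	factory.Start(stopCh)
	// Block until the initial LIST has populated the local cache.
	if !cache.WaitForCacheSync(stopCh, podInformer.Informer().HasSynced) {
		return fmt.Errorf("cache failed to sync")
	}

	// Served entirely from the indexer; no API round-trip.
	pods, err := lister.Pods("default").List(labels.Everything())
	if err != nil {
		return err
	}
	fmt.Printf("cached pods: %d\n", len(pods))
	return nil
}
```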
Thank you, @soulless-viewer. It's becoming clearer, but do you have any best practices for using resync? All informers I've seen so far look like this, so when a resync happens, the object will have the same resource version, making the resync operation useless. Should we instead store events that fail to be handled, or something like that?
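To make my question concrete, a rough sketch of the kind of "store and retry failed events" pattern I have in mind, using a rate-limited workqueue (`syncHandler` here is just an illustrative placeholder for the real reconcile logic):

```go
package main

import (
	"k8s.io/client-go/util/workqueue"
)

// runWorker drains the queue and re-adds failed keys with backoff, instead
// of relying on the periodic resync to redeliver them.
func runWorker(queue workqueue.RateLimitingInterface, syncHandler func(key string) error) {
	for {
		key, quit := queue.Get()
		if quit {
			return
		}
		if err := syncHandler(key.(string)); err != nil {
			// Keep the failed key around and retry it later with backoff.
			queue.AddRateLimited(key)
		} else {
			// Success: clear the key's rate-limit history.
			queue.Forget(key)
		}
		queue.Done(key)
	}
}
```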
@stillya I think your question is more around "Relist" vs "Resync". If so, this link should provide you with more clarity https://hex108.gitbook.io/kubernetes-notes/fu-lu-rtfsc/informer
Resync puts all the data from the indexer back into the FIFO and triggers an update. My question is: if this logic is designed to guard against errors during the first round of update processing, then the replayed update may fail too, and ADD and DELETE handlers can have the same problem. Why are ADD and DELETE callbacks not replayed as well?
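For what it's worth, a simplified model of the mechanism in question (this is not the actual client-go source; all names are illustrative). Resync walks only the keys the indexer still knows about, so deleted objects are never replayed, and every replayed object already exists in the cache:

```go
package main

// A simplified model (not the actual client-go source) of DeltaFIFO's resync.

type DeltaType string

const Sync DeltaType = "Sync"

type Delta struct {
	Type   DeltaType
	Object interface{}
}

// keyLister stands in for the indexer (the "known objects") the FIFO sees.
type keyLister interface {
	ListKeys() []string
	GetByKey(key string) (interface{}, bool)
}

type fifo struct {
	known  keyLister
	queued map[string]bool // keys that already have pending deltas
	out    []Delta
}

// resync replays only objects that still exist in the local cache: a deleted
// object is gone from the indexer and can never be replayed, and nothing the
// cache has never seen can appear either.
func (f *fifo) resync() {
	for _, key := range f.known.ListKeys() {
		if f.queued[key] {
			// A fresher delta from the watch is pending; don't clobber it.
			continue
		}
		if obj, ok := f.known.GetByKey(key); ok {
			f.out = append(f.out, Delta{Type: Sync, Object: obj})
		}
	}
}

// On the consumer side, a Sync delta whose object already exists in the
// indexer (during resync it always does) is dispatched as an update, which
// is why only UpdateFunc fires and ADD/DELETE callbacks are never replayed.
```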