kubebuilder
Docs suggestion: Optimistic Concurrency
Suggest adding the following to the KB docs.
@droot @akashrv
Sometimes a resource is updated while the controller is working on that resource. For example, a user might modify the resource's spec while the controller is computing and updating its status.
Kubernetes implements optimistic concurrency on the server side, so if a client sends an update with a stale object (an older resource version), the server rejects the request and the client returns an error. KB handles this for you by requeuing the status update. On the next invocation, your controller will likely recompute the status using the latest copy of the object, and the update should succeed.
If you are using controller-runtime/client directly, then retrying is as simple as return Result{Requeue: true}.
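To make the conflict-then-requeue flow concrete, here is a minimal sketch that uses only the Go standard library. It does not call Kubernetes or controller-runtime: Store, Object, Result and Reconcile are hypothetical stand-ins for the API server, a resource, the reconcile result and a controller's reconcile function, and the integer ResourceVersion is a simplification of the real (opaque) resource version.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict stands in for the API server's conflict response (assumption: toy error, not the real API type).
var ErrConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// Object is a hypothetical resource with a spec, a status and a resource version.
type Object struct {
	ResourceVersion int
	Spec            string
	Status          string
}

// Store is a toy in-memory stand-in for the API server.
type Store struct {
	mu  sync.Mutex
	obj Object
}

// Get returns a copy of the stored object.
func (s *Store) Get() Object {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.obj
}

// Update accepts the write only if the caller's copy is based on the current version.
func (s *Store) Update(o Object) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if o.ResourceVersion != s.obj.ResourceVersion {
		return ErrConflict
	}
	o.ResourceVersion++
	s.obj = o
	return nil
}

// Result mirrors the shape of a reconcile result: Requeue asks for another pass.
type Result struct{ Requeue bool }

// Reconcile computes status from a copy of the object and writes it back.
// interfere, if non-nil, runs between the read and the write to simulate a user edit.
func Reconcile(s *Store, interfere func()) (Result, error) {
	obj := s.Get()
	obj.Status = "observed:" + obj.Spec
	if interfere != nil {
		interfere()
	}
	if err := s.Update(obj); err != nil {
		if errors.Is(err, ErrConflict) {
			// Stale copy: ask for another pass instead of failing.
			return Result{Requeue: true}, nil
		}
		return Result{}, err
	}
	return Result{}, nil
}

func main() {
	s := &Store{obj: Object{ResourceVersion: 1, Spec: "v1"}}
	userEdit := func() {
		cur := s.Get()
		cur.Spec = "v2"
		_ = s.Update(cur)
	}

	res, err := Reconcile(s, userEdit) // the user edit makes our copy stale
	fmt.Println(res.Requeue, err)      // true <nil>

	res, err = Reconcile(s, nil)                  // next pass starts from the latest copy
	fmt.Println(res.Requeue, err, s.Get().Status) // false <nil> observed:v2
}
```

The second pass succeeds because it starts from a fresh read, which is the same reason a requeued reconcile usually succeeds against a real cluster.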
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
/lifecycle frozen
This one would be a great fit for a FAQ section where we could do the same clarifications done in: https://sdk.operatorframework.io/docs/best-practices/common-recommendation/
We need to let users know that the goal is to:
Develop idempotent reconciliation solutions
When developing operators, it is essential for the controller’s reconciliation loop to be idempotent. By following the Operator pattern you will create Controllers which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster. Breaking this recommendation goes against the design principles of controller-runtime and may lead to unforeseen consequences such as resources becoming stuck and requiring manual intervention.
Also, when we get the resource during reconciliation, we store its data in a Go structure. Before using the client to update or change the resource, it is therefore recommended to fetch the latest version. Otherwise, we might face issues because the state of the resource on the cluster is no longer the same, in which case the reconciliation should be re-queued.
Example of the error seen in the logs: "the object has been modified; please apply your changes to the latest version and try again".
Since the above comment describes what we need to do here, I think it is a good candidate for a "good first issue".
I would like to work on this issue @camilamacedo86 /assign
@camilamacedo86 Is this issue still relevant or should I drop this for now?
I think we can close this one. The best approach in this case is to ensure that you fetch the resource before any update, which avoids this issue. However, @ashutosh887, if you or anyone else would like to document this, it would be nice to add it to the FAQ: https://book.kubebuilder.io/faq
Something like the following example:
How do I avoid the error "the object has been modified; please apply your changes to the latest version and try again"? Why does it occur?
When developing operators, it is essential for the controller's reconciliation loop to be idempotent. By following the Operator pattern[1], we create Controllers[2] that provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster. In other words, reconciliation is a loop that keeps running until it can ensure the desired state on the cluster.
The error "the object has been modified; please apply your changes to the latest version and try again" usually happens because we fetched the resource and stored it in a variable in our reconcile function. Then, after a while, we try to update the resource using the data stored in that variable. However, the state of the resource on the cluster changed between the time we fetched it with the client and the time we tried to update it.
An alternative to the proposed idea of relying on the server side is to ensure that you re-fetch the resource (client.Get) before any update (client.Update).
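The re-fetch-before-update idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not controller-runtime code: fakeClient, resource and updateWithRefetch are hypothetical names, and the beforeUpdate hook exists only to simulate another writer changing the object between our read and our write.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict imitates the message returned for a stale update (assumption: toy error).
var errConflict = errors.New("the object has been modified; please apply your changes to the latest version and try again")

// resource is a hypothetical object with a version and two fields owned by different writers.
type resource struct {
	ResourceVersion int
	Replicas        int  // changed by a user
	Ready           bool // changed by the controller
}

// fakeClient is a hypothetical stand-in for a Kubernetes client backed by one stored object.
type fakeClient struct {
	stored       resource
	beforeUpdate func() // optional one-shot hook simulating a concurrent writer
}

// Get returns the latest stored copy.
func (c *fakeClient) Get() resource { return c.stored }

// Update rejects writes that are based on an outdated resource version.
func (c *fakeClient) Update(r resource) error {
	if hook := c.beforeUpdate; hook != nil {
		c.beforeUpdate = nil // fire only once
		hook()
	}
	if r.ResourceVersion != c.stored.ResourceVersion {
		return errConflict
	}
	r.ResourceVersion++
	c.stored = r
	return nil
}

// updateWithRefetch re-reads the latest copy before every attempt, applies
// mutate to that fresh copy, and retries only when the failure is a conflict.
func updateWithRefetch(c *fakeClient, attempts int, mutate func(*resource)) error {
	for i := 0; i < attempts; i++ {
		latest := c.Get()
		mutate(&latest)
		err := c.Update(latest)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return errConflict
}

func main() {
	c := &fakeClient{stored: resource{ResourceVersion: 7, Replicas: 1}}
	// Simulate a user scaling the resource while the controller is mid-update.
	c.beforeUpdate = func() {
		c.stored.Replicas = 3
		c.stored.ResourceVersion++
	}

	err := updateWithRefetch(c, 3, func(r *resource) { r.Ready = true })
	fmt.Println(err, c.stored.Replicas, c.stored.Ready, c.stored.ResourceVersion)
	// <nil> 3 true 9
}
```

Because the change is applied to a freshly fetched copy on each attempt, the user's Replicas edit is preserved rather than overwritten. In a real controller, client-go's retry.RetryOnConflict helper (package k8s.io/client-go/util/retry) offers a comparable retry loop around a Get followed by an Update.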
I am closing this one, but feel free to contribute to the docs if you wish.