Can patchHelper provide an option to verify resourceVersion?
As a developer, I use the patchHelper in our controllers to patch our CR at the end of the reconcile(), like CAPI does.
We found a small problem: the patchHelper uses client.MergeFrom() in Helper.patch() and Helper.patchStatus(), so the apiserver does not verify resourceVersion when patching.
Sometimes the controller reconciles stale data (because its cache has not been updated in time), and fields written by the previous reconcile() (like a timestamp or a random ID) are overwritten by the current reconcile().
We want to avoid overwriting fields when reconciling stale data, and we don't need different controllers to act on conditions of the same object. Could the patchHelper provide an option to verify resourceVersion and refuse to patch when the resourceVersion check fails?
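For context, here is a minimal sketch of the pattern in question, assuming the CAPI patch helper from sigs.k8s.io/cluster-api/util/patch and an illustrative ClusterReconciler (not any specific controller); the comments mark where the missing resourceVersion check bites:

```go
package controllers

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/patch"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// ClusterReconciler is only an illustration of the reconcile pattern described above.
type ClusterReconciler struct {
	Client client.Client
}

func (r *ClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	cluster := &clusterv1.Cluster{}
	if err := r.Client.Get(ctx, req.NamespacedName, cluster); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Snapshot the object as read from the (possibly stale) informer cache.
	patchHelper, err := patch.NewHelper(cluster, r.Client)
	if err != nil {
		return ctrl.Result{}, err
	}

	// ... reconcile logic mutates cluster.Spec / cluster.Status ...

	// Helper.Patch builds its patches with client.MergeFrom, so the apiserver does
	// not compare resourceVersion: a patch computed from stale cache data can
	// overwrite fields written by the previous reconcile without any conflict error.
	return ctrl.Result{}, patchHelper.Patch(ctx, cluster)
}
```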
This issue is currently awaiting triage.
If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/triage needs-information
I'm a little concerned about adding options that we don't use in CAPI ourselves, for a couple of practical reasons. But I would also like to better understand the use cases, because at first sight this seems like a symptom of the controller relying on something set in a previous reconcile, which is generally a bad practice.
@fabriziopandini I understand your concerns.
Another specific case we have: a CR waits for an asynchronous task to finish, and the task ID is recorded in status.taskID. Every reconcile() checks whether the task has completed and deletes status.taskID if it has. Sometimes task B is started immediately after task A finishes. After task B's ID is written to status.taskID, the next reconcile() may still read task A's ID from status.taskID (from a stale cache); since task A has completed, it clears status.taskID. But the value actually deleted on the server is task B's ID, so task B is lost.
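To make the race concrete, here is a rough sketch in Go; MyCRStatus and TaskClient are hypothetical placeholders for the CR status and the task service described above, not CAPI or controller-runtime APIs:

```go
package example

import "context"

// MyCRStatus and TaskClient are hypothetical stand-ins for the CR status and the
// asynchronous task service described above.
type MyCRStatus struct {
	TaskID string `json:"taskID,omitempty"`
}

type TaskClient interface {
	IsCompleted(ctx context.Context, id string) bool
}

// reconcileTask mirrors the per-reconcile check: the decision to clear taskID is
// made against cached data, which may still hold task A's ID even though the
// previous reconcile already wrote task B's ID to the apiserver.
func reconcileTask(ctx context.Context, status *MyCRStatus, tasks TaskClient) {
	if status.TaskID == "" {
		return
	}
	if tasks.IsCompleted(ctx, status.TaskID) { // stale value: task A, already done
		// Clearing the field and patching with client.MergeFrom (no resourceVersion
		// precondition) also wipes task B's ID on the server, so task B is lost.
		status.TaskID = ""
	}
}
```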
So it would be nice if the CAPI patch helper could verify resourceVersion.
This is already possible in controller-runtime by using https://github.com/kubernetes-sigs/controller-runtime/blob/3d915df2799fcacd28e679bf1df4a59a957644c9/pkg/client/patch.go#L57-L71 (which the condition patches already force).
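For reference, a minimal sketch of that controller-runtime-level workaround (the ConfigMap type is just an example object):

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// patchWithOptimisticLock uses client.MergeFromWithOptimisticLock{} so the generated
// merge patch carries the snapshot's resourceVersion; the apiserver then rejects the
// patch with a conflict if the object changed since it was read, instead of silently
// overwriting newer data.
func patchWithOptimisticLock(ctx context.Context, c client.Client, obj *corev1.ConfigMap) error {
	before := obj.DeepCopy()

	// ... mutate obj ...

	return c.Patch(ctx, obj, client.MergeFromWithOptions(before,
		client.MergeFromWithOptimisticLock{}))
}
```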
Hi @fabriziopandini, could we add a MergeFromWithOptimisticLock option to the CAPI patch helper?
Does the solution suggested by @vincepri work? That would let you address the use case with a controller-runtime patch, without CAPI implementing an option we don't use ourselves (and would struggle to prevent regressions on).
But I'm also open to other options if there is consensus on how to address the above (or if this is not perceived as a concern).
Just saw https://github.com/kubernetes-sigs/cluster-api/issues/8706, which seems related.
https://github.com/kubernetes-sigs/cluster-api/issues/8706 is exactly what we meant. We hit bugs in the described scenario when using the CAPI patchHelper.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.