Conflict resolution should rely only on resource naming
What would you like to be added:
Change conflict resolution rules to use only alphabetical order of "{namespace}/{name}". Stop using conflict resolution by "creationTimestamp".
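For illustration, here is a minimal Go sketch of the proposed naming-only rule: the object whose "{namespace}/{name}" key sorts first alphabetically wins, with no timestamp involved. The types and helper names below are mine, not from the Gateway API codebase.

```go
// Minimal sketch of the proposed rule (not the actual Gateway API
// implementation): when two policies conflict, the one whose
// "{namespace}/{name}" key sorts first alphabetically wins.
package main

import "fmt"

// conflictKey builds the proposed comparison key for an object.
func conflictKey(namespace, name string) string {
	return namespace + "/" + name
}

// winner returns the key that takes precedence under the
// proposed naming-only rule.
func winner(a, b string) string {
	if a < b { // lexicographic comparison is stable and deterministic
		return a
	}
	return b
}

func main() {
	a := conflictKey("team-a", "rate-limit")
	b := conflictKey("team-b", "rate-limit")
	fmt.Println(winner(a, b)) // "team-a/rate-limit" always wins
}
```

Because the key depends only on names, the same winner is picked no matter when or in what order the objects were created.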
Why this is needed:
Conflict resolution by "creationTimestamp" has a couple of problems:
- When auto-syncing policies across different k8s clusters, the original creation order is not preserved.
- When re-creating a cluster from scratch. The Kube API Server doesn't provide a "happens-before" guarantee that "creationTimestamp" will be aligned with the order of API calls. From the k8s API reference:
> CreationTimestamp is a timestamp representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations. Clients may not set this value. It is represented in RFC3339 form and is in UTC. Populated by the system. Read-only. Null for lists.
Conflict resolution by "creationTimestamp" complicates troubleshooting and causes non-deterministic behavior.
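For comparison, here is a hedged sketch of the timestamp-first behavior being criticized, paraphrased from my reading of the spec (the types are illustrative stand-ins): the oldest creationTimestamp wins and naming is only the tie-breaker, so any process that rewrites timestamps (backup/restore, cross-cluster sync) can flip the outcome.

```go
// Hedged sketch of the timestamp-first rule: the oldest
// creationTimestamp wins, and "{namespace}/{name}" only breaks ties.
package main

import (
	"fmt"
	"time"
)

type policy struct {
	namespace, name string
	created         time.Time // stands in for metav1.ObjectMeta.CreationTimestamp
}

func key(p policy) string { return p.namespace + "/" + p.name }

// older returns the policy that wins under the timestamp-first rule.
// Because the API server gives no happens-before guarantee for
// CreationTimestamp, restoring or re-syncing objects can change the result.
func older(a, b policy) policy {
	switch {
	case a.created.Before(b.created):
		return a
	case b.created.Before(a.created):
		return b
	default: // equal timestamps: fall back to alphabetical order
		if key(a) < key(b) {
			return a
		}
		return b
	}
}

func main() {
	t := time.Now()
	a := policy{"team-a", "rate-limit", t.Add(time.Second)}
	b := policy{"team-b", "rate-limit", t}
	fmt.Println(key(older(a, b))) // "team-b/rate-limit": created first
}
```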
While I agree with your points, IMO creationTimestamp is useful since it means that new resources do not break existing ones. That means if I introduce a conflict, I don't take down prod -- just the new config I introduced doesn't work.
And this level of conflict resolution should be a last resort; it always represents a misconfiguration by the user.
If it's possible at all, don't you think it's likely a user would end up relying on this behavior? Should duplicates be rejected instead of relying on non-deterministic ordering?
I guess enforcing a check like this at the admission webhook level is annoying, and that's the reason why it's this way?
It's hard (and not recommended) to have cross-resource validation, for many of the reasons mentioned in the initial description, actually. There can still be status messages that warn about it, though.
Yes, it's absolutely intended that the "first created" rule is a last resort for if there's a configuration error. Any conflict resolved by this method should also generate Conditions in the status that warn about what happened and why, so that the conflict can be removed as quickly as possible.
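To make the "warn via status" idea concrete, here is a rough Go sketch of a controller recording such a Condition on the losing object. The condition type and reason strings are illustrative assumptions, not names mandated by the spec.

```go
// Hedged sketch of the "warn via status" approach: the controller
// records a Condition on the losing object explaining which
// conflict-resolution rule was applied. "Conflicted" and
// "OlderResourceExists" are assumed names, not from the spec.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markConflicted appends or updates a warning condition on the losing
// object's status condition list.
func markConflicted(conditions *[]metav1.Condition, generation int64, winner string) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:               "Conflicted", // assumed condition type
		Status:             metav1.ConditionTrue,
		ObservedGeneration: generation,
		LastTransitionTime: metav1.Now(),
		Reason:             "OlderResourceExists", // assumed reason
		Message:            "conflicting configuration ignored; " + winner + " takes precedence",
	})
}

func main() {
	var conds []metav1.Condition
	markConflicted(&conds, 3, "team-a/rate-limit")
	fmt.Println(conds[0].Message)
}
```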
To put this another way, I'd argue that the problems you mention should be solved in different places:
- Any automatic syncing mechanism should not be syncing objects with significant errors (which this sort of conflict should be counted as).
- Likewise, backups really shouldn't be taken while an error like this is present.
I definitely agree that using creation timestamp as the deadlock breaker is not ideal, but it's the least worst option we have.
Does all of this answer your question @lahabana? I really appreciate you asking it, this is exactly the sort of thing that's hard to put into the spec itself.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
If there are warnings in case of conflicts, it makes it obvious to users not to rely on this.
/close
@lahabana: Closing this issue.
In response to this:
If there are warnings in case of conflicts, it makes it obvious to users not to rely on this.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.