argo-workflows
Resume/suspend/terminate/stop will result in invalid state
Summary
Only the workflow controller should be able to change a workflow.
Motivation
All of these operations can result in invalid state on large workflows because they update the workflow outside the controller's main loop.
Proposal
The controller currently reacts to pod and workflow changes, but it may already be processing a workflow when a third party changes that workflow, resulting in a conflict and invalid state.
We could have a new operation queue that takes operations to workflows and ensures they are applied sequentially.
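To make the idea concrete, here is a minimal sketch of such a queue, not the controller's actual code. `Workflow`, `Op`, `OperationQueue`, and `apply` are all hypothetical names invented for illustration, and the rule that nothing but a terminate is accepted after a shutdown is an assumption of this toy model. The only point is that a single worker goroutine owns every mutation, so operations cannot interleave.

```go
package main

import (
	"fmt"
	"sync"
)

// Op is a hypothetical operation kind requested against a workflow.
type Op string

const (
	OpSuspend   Op = "suspend"
	OpResume    Op = "resume"
	OpStop      Op = "stop"
	OpTerminate Op = "terminate"
)

// Workflow is a toy stand-in for the real Workflow resource.
type Workflow struct {
	Name      string
	Suspended bool
	Shutdown  string // "", "Stop" or "Terminate" in this toy model
}

// request pairs an operation with a channel for its result.
type request struct {
	op   Op
	done chan error
}

// OperationQueue serializes operations: only its single worker
// goroutine ever mutates the workflow.
type OperationQueue struct {
	wf       *Workflow
	requests chan request
	wg       sync.WaitGroup
}

// NewOperationQueue starts the worker goroutine for one workflow.
func NewOperationQueue(wf *Workflow) *OperationQueue {
	q := &OperationQueue{wf: wf, requests: make(chan request, 16)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		// Requests are applied strictly in arrival order.
		for r := range q.requests {
			r.done <- apply(q.wf, r.op)
		}
	}()
	return q
}

// Submit enqueues an operation and blocks until the worker has applied it.
func (q *OperationQueue) Submit(op Op) error {
	r := request{op: op, done: make(chan error, 1)}
	q.requests <- r
	return <-r.done
}

// Close stops accepting operations and waits for the worker to drain.
func (q *OperationQueue) Close() {
	close(q.requests)
	q.wg.Wait()
}

// apply validates an operation against the current state, then mutates it.
// Assumption: once shut down, only a terminate is still accepted.
func apply(wf *Workflow, op Op) error {
	if wf.Shutdown != "" && op != OpTerminate {
		return fmt.Errorf("cannot %s %q: already shutting down (%s)", op, wf.Name, wf.Shutdown)
	}
	switch op {
	case OpSuspend:
		wf.Suspended = true
	case OpResume:
		wf.Suspended = false
	case OpStop:
		wf.Shutdown = "Stop"
	case OpTerminate:
		wf.Shutdown = "Terminate"
	default:
		return fmt.Errorf("unknown operation %q", op)
	}
	return nil
}

func main() {
	wf := &Workflow{Name: "demo"}
	q := NewOperationQueue(wf)
	for _, op := range []Op{OpSuspend, OpResume, OpStop, OpResume} {
		if err := q.Submit(op); err != nil {
			fmt.Println("rejected:", err)
		}
	}
	q.Close()
	fmt.Printf("%+v\n", *wf)
}
```

Because validation and mutation happen in the same goroutine, an operation that no longer makes sense (here, a resume after a stop) is rejected instead of racing with the previous one.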
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.
See #2367
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I think this is no longer an issue? We have several checks in the codebase where the ShutdownStrategy is handled currently.
If these operations require changes to the Workflow (via spec, annotation, or label), then they are more or less queued up as resourceVersion changes.
> Only the workflow controller should be able to change a workflow.
Related: #12538, which covers the manual Retry operation.
> We could have a new operation queue that takes operations to workflows and ensures they are applied sequentially.
Also, `*Request` CRDs, as I mentioned in https://github.com/argoproj/argo-workflows/issues/6490#issuecomment-1961246329, would very explicitly delegate to the Controller and would be queued (as the entire Controller is queued).