argo-workflows
Add a `pendingTimeout` parameter
Summary
See https://github.com/argoproj/argo-workflows/issues/3572 for context.
Some of our workflows fail to get scheduled onto a k8s node because of occasional errors in the configuration responsible for executing the workflow.
The currently available option, activeDeadlineSeconds, covers both the pending phase and the execution phase. We need an option that considers only the pending phase, so that a workflow stuck in pending is marked as failed after xxx seconds.
This new option could be called pendingTimeoutSeconds or pendingDeadlineSeconds.
Message from the maintainers:
Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.
I investigated this issue and when I looked into a possible solution, I ran into one that actually already exists: #3686
The template timeout field, as currently documented (https://argo-workflows.readthedocs.io/en/latest/fields/#template), sounds like a duplicate of the activeDeadlineSeconds field. As actually implemented, however, the node StartedAt time is the time the workflow node was created, and timeout is only considered for nodes in the NodePending phase, which makes timeout behave more like pendingTimeout in practice.
I have verified that specifying timeout: 600s in my templates does indeed prevent them from spending more than 600s in Pending state, while allowing them to run for however long they need to.
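For reference, a minimal sketch of what such a template might look like. Only the `timeout: 600s` line reflects what I described above; the workflow name, template name, image, and command are placeholders chosen for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pending-timeout-example-   # hypothetical name
spec:
  entrypoint: main
  templates:
    - name: main                           # hypothetical template name
      # Template-level timeout, as described in the comment above.
      timeout: 600s
      container:
        image: alpine:3.19                 # placeholder image
        command: [sh, -c]
        args: ["echo hello"]
```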
Perhaps some improvement to the documentation is in order?
Update: while the timeout parameter does seem to catch pods that have been pending too long, it has some issues and consequences:
- The template timeout is only evaluated "incidentally" and is not guaranteed to be evaluated close to the expiration time, so it's more of a "minimum" than a "maximum" parameter.
- The template timeout is transferred to activeDeadlineSeconds if that param is unset or greater than the template timeout. So timeout does not apply only to the pending state; it is effectively a full end-to-end timeout.
I am investigating other options for resolution.