cylc-flow
flow=all in a restarted completed workflow
https://cylc.discourse.group/t/adding-new-task-to-succeeded-workflow-explicit-insertion/520/3
[UPDATE: supersedes #5072 ]
Scenario:
- restart a completed workflow (with `--pause` to keep the scheduler up, but see #5078) after adding new tasks to the graph
- then trigger a new task, or trigger an old task that should spawn a new task (according to the new graph)
Result:
- the default `flow=all` trigger results in `flow=none` because there are no existing flows to belong to (the workflow completed already under the original graph)
So to make the new tasks flow on you have to explicitly set a flow number: `cylc trigger --flow=1`.
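A minimal sketch of the rule at work, for illustration only. It is not Cylc's implementation: `resolve_trigger_flows` and its arguments are hypothetical names, and the only assumption is the rule stated above, that the default `all` means "all flows currently active in the scheduler".

```python
from typing import Set


def resolve_trigger_flows(requested: str, active_flows: Set[int]) -> Set[int]:
    """Return the flow numbers a manually triggered task would carry.

    Hypothetical model of the triggering rule discussed in this issue,
    not Cylc's real code.
    """
    if requested == "all":
        # "all" means every flow that is active right now.
        return set(active_flows)
    if requested == "none":
        return set()
    # Otherwise treat the value as an explicit flow number, e.g. --flow=1.
    return {int(requested)}


# Running workflow with flow 1 active: the default trigger joins flow 1.
running = resolve_trigger_flows("all", {1})

# Restarted completed workflow: no flows are active, so "all" resolves to
# the empty set, which is the same as flow=none (the task does not flow on).
restarted = resolve_trigger_flows("all", set())

# An explicit flow number lets the new task flow on.
explicit = resolve_trigger_flows("1", set())
```

With an empty set of active flows, the default and `flow=none` are indistinguishable, which is why the result surprises users.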
This is correct behaviour according to our flow triggering rules, but in this particular case it won't be exactly obvious to users.
Question: should new task(s) run automatically - with the default trigger - in this scenario?
If yes, we'll have to treat triggering differently in a restarted completed workflow:
- "all existing flows" -> "all previous flows"?
- "all existing flows" -> "a new flow"? (this would result in a new flow even if triggering a past task)
Proposal: if there are no existing flows, triggered tasks should default to [UPDATE after discussion below]~~all~~ the most recent previous flow~~s~~.
This can only happen:
- on restarting a completed workflow
- and/or if there is nothing left but active no-flow tasks
(Note we can't use the most recent previous flow of the triggered task, because the triggered task could be newly added to the graph in which case it never ran before).
> Proposal: if there are no existing flows, triggered tasks should default to all previous flows.
I don't think this is safe. When triggering historical tasks I think it would sometimes act like a new flow and sometimes not (depending on whether you had previously triggered a flow on a branch that didn't merge back with the main flow). Needs further thought. Treating this as a new flow may be the best we can do?
> When triggering historical tasks I think it would sometimes act like a new flow and sometimes not (depending on whether you had previously triggered a flow on a branch that didn't merge back with the main flow).
Not sure I follow that. If we give the triggered task ALL previous flow numbers, and it is a historical task (i.e. it already ran at least once before), then it will not flow on unless the previous run of it did not spawn children (e.g. because it failed). But at least the reason it won't flow on will be clearer (as opposed to getting `--flow=none` without asking for that).
> Treating this as a new flow may be the best we can do?
It seems to me this violates your stipulation (which I now agree with, given what we've ended up with as default behaviour!) that users should not be exposed to "new flows" unless they deliberately and explicitly opt in.
> Needs further thought.
OK, perhaps we should aim for consistency across scheduler shutdown/restart. On restart, interrogate the DB to discover the final task that ran just before shutdown, and give the newly triggered task the flow numbers of that task. That way, the new task behaves the same whether triggered just before shutdown (getting all current flow numbers) or just after restart.
(Not sure why I didn't suggest this in the first place ... )
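A rough sketch of that restart-time lookup, under loudly stated assumptions: the table `task_runs(name, flow_nums, run_order)` is a made-up stand-in and does not reflect the real Cylc database schema, the demo rows are invented, and `last_run_flow_nums` and `default_trigger_flows` are hypothetical helper names.

```python
import sqlite3
from typing import Set


def last_run_flow_nums(conn: sqlite3.Connection) -> Set[int]:
    """Flow numbers of the task that ran most recently.

    Assumes a hypothetical table task_runs(name, flow_nums, run_order),
    with flow_nums stored as comma-separated text.
    """
    row = conn.execute(
        "SELECT flow_nums FROM task_runs ORDER BY run_order DESC LIMIT 1"
    ).fetchone()
    if row is None or not row[0]:
        return set()
    return {int(n) for n in row[0].split(",")}


def default_trigger_flows(
    active_flows: Set[int], conn: sqlite3.Connection
) -> Set[int]:
    """Default flows for a triggered task: active flows if there are any,
    otherwise the flows of the final task that ran before shutdown."""
    if active_flows:
        return set(active_flows)
    return last_run_flow_nums(conn)


# Invented history: flow 1 ran a and b, then flow 2 ran x last.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_runs (name TEXT, flow_nums TEXT, run_order INTEGER)"
)
conn.executemany(
    "INSERT INTO task_runs VALUES (?, ?, ?)",
    [("1/a", "1", 1), ("1/b", "1", 2), ("1/x", "2", 3)],
)

# Triggered just before shutdown, while 1/x (flow 2) is still active.
before_shutdown = default_trigger_flows({2}, conn)
# Triggered just after restart, when no flows are active.
after_restart = default_trigger_flows(set(), conn)
```

In this toy history both calls give the same flow numbers, which is the consistency across shutdown and restart that the suggestion aims for.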
I'd interpreted "all previous flow numbers" to mean all flow numbers from the entire workflow. Does it actually mean all previous flow numbers from the triggered task? If so then I think that's OK.
> On restart, interrogate the DB to discover the final task that ran just before shutdown, and give the newly triggered task the flow numbers of that task.
I think this suffers the same issue I was concerned about. The final task to run might have been a new flow on an isolated branch.
> The final task to run might have been a new flow on an isolated branch
Yes, but if you manually triggered the next task before shutdown - while that final off-piste task was still running - instead of after restart, it would get that same flow number anyway. Why is that a problem? Flows don't have to be contiguous (in the graph) when manual triggering is involved.
> Yes, but if you manually triggered the next task before shutdown - while that final off-piste task was still running - instead of after restart, it would get that same flow number anyway.
Good point - you've convinced me!
OK, removing the "question" label, we'll go with "most recent previous flow" for the reasons discussed.