cylc-flow
reload/restart: parentless tasks
If parentless tasks are added to the workflow config, they are not spawned on reload/restart.
The Issue
E.g. if you start this workflow:
P1Y = foo
Then modify it to:
P1Y = foo => bar
Then the bar task will run after foo as expected.
However, if you modify it to:
P1Y = foo => bar
P2Y = baz
The baz task will not be spawned / run. This issue only applies to parentless tasks, due to their special spawn logic; all other graph changes will be applied. This is inconsistent and rather confusing for users.
I don't think there has been (or I can't remember) any discussion on this one, so I think this behaviour is undefined, hence the question label. It could be argued that this is correct by SoD (spawn-on-demand) implementation details, or that it's a bug because the behaviour is inconsistent / unexpected.
Either way, we need to agree what should happen here and at least document it.
Auto Insert Solution
On reload, we already detect any tasks which have been added to the workflow, so we could easily use this to spawn the new task out to the runahead limit, as the user intended. No problem! Which flows to spawn the task in, however, may be a problem.
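A rough sketch of the detection step, to make the idea concrete. This is not Cylc's graph parser: `parse_graph` and `added_parentless` are hypothetical helpers, and the toy parser only understands simple `a => b => c` chains (no `&`, `|`, qualifiers or inter-cycle offsets). A task that has a parent in any recurrence is treated as having parents.

```python
def parse_graph(graph: dict[str, str]) -> tuple[set[str], set[str]]:
    """Return (all_tasks, tasks_with_parents) for a toy graph.

    `graph` maps a recurrence (e.g. "P1Y") to a chain like "foo => bar".
    Hypothetical helper: a stand-in for real graph parsing.
    """
    all_tasks: set[str] = set()
    with_parents: set[str] = set()
    for chain in graph.values():
        names = [name.strip() for name in chain.split("=>")]
        all_tasks.update(names)
        # Everything to the right of an arrow has at least one parent.
        with_parents.update(names[1:])
    return all_tasks, with_parents


def added_parentless(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Tasks present in `new` but not in `old` that have no parents."""
    old_tasks, _ = parse_graph(old)
    new_tasks, new_with_parents = parse_graph(new)
    return (new_tasks - old_tasks) - new_with_parents


old = {"P1Y": "foo"}
new = {"P1Y": "foo => bar", "P2Y": "baz"}
print(added_parentless(old, new))
```

With the example from this issue, only baz is reported: bar is new too, but it has a parent (foo) so it gets spawned by the normal logic.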
On restart, we don't have the old configuration handy for comparison, so we can't even detect newly added tasks. I think we have two options for determining which tasks were in the workflow pre-restart:
- The database.
  - However, this could become convoluted by interactions with cylc remove. If a task is manually removed in such a way that all future instances were cancelled, then we wouldn't want a restart to undo this.
- The log/config files.
  - As of Cylc 8 we now back up the config to the log/config directory on start/restart/reload so can easily load the old config.
  - Fine in principle, and might help with other matters, however, we haven't relied on the files in this directory in the past.
  - For larger workflows this will up to double the config load time. Not sure if that's actually a problem though.
So there are two problems to resolve in order to auto-insert parentless tasks:
1. Which flows to insert into?
2. How to determine which tasks have been added to the config on restarts?
Docs Solution
If we have a solution to question (1) above, then we could at least log a warning, alerting the user to the tasks they need to insert.
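For illustration only, the warning could be as simple as the sketch below. The logger name, the function `warn_unspawned` and the message wording are all made up here, not existing Cylc code or log output.

```python
import logging
from typing import Optional

LOG = logging.getLogger("reload-sketch")  # hypothetical logger name


def warn_unspawned(added_parentless: set[str]) -> Optional[str]:
    """Log (and return) a warning listing newly added parentless tasks.

    Returns None if there is nothing to warn about.
    """
    if not added_parentless:
        return None
    # Sort the names so the message is stable between runs.
    message = (
        "Parentless task(s) added to the graph will not be spawned "
        "automatically: " + ", ".join(sorted(added_parentless))
    )
    LOG.warning(message)
    return message


warn_unspawned({"baz"})
```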
This is a known feature, and not a bug (not that you are definitely saying it is a bug!) but I agree some users will be surprised by it.
What to do after restart in the old => new scenario is well-defined:
- the new task will naturally be spawned whenever an instance (at a particular cycle point) of the old one succeeds
The new => whatever scenario is ambiguous though:
- which instance (cycle point) should be chosen as the first one?
- in principle this matters; it might be an expensive task, or it might fail behind a certain cycle point
So in general I think this requires direction from the user, to specify which instance of the new task to start with. Which is not that much different from requiring the user to manually trigger the first instance (as is the case now).
I suppose we could choose a default behavior though, which might be good enough most of the time. The obvious default might be:
- spawn new from the earliest valid point in the current active window?
To do this, I think we can easily detect addition of new tasks and check if they're parentless. A nastier possibility though is a reload/restart that makes an existing task become parentless.
(And that's without considering the flow ownership issue).
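To make the "earliest valid point in the current active window" default concrete, here is a minimal sketch using integer cycle points. It assumes the active window means the range from the earliest active cycle point up to the runahead limit, which is my reading rather than an agreed definition, and `first_point_in_window` is a hypothetical helper, not a Cylc function.

```python
from typing import Optional


def first_point_in_window(
    start: int, step: int, window_min: int, window_max: int
) -> Optional[int]:
    """First point of start, start + step, ... inside [window_min, window_max].

    `start` and `step` describe the new task's recurrence (integer cycling
    for simplicity). Returns None if no point falls inside the window.
    """
    if window_min <= start:
        candidate = start
    else:
        # Round up to the next sequence point at or after window_min.
        steps = -(-(window_min - start) // step)  # ceiling division
        candidate = start + steps * step
    return candidate if candidate <= window_max else None


# A P2Y-style task on a sequence starting at 2000, with an active
# window of 2003 to 2006 (values chosen for illustration).
print(first_point_in_window(2000, 2, 2003, 2006))
```

For that example the first instance would be at 2004. Whether that is the instance the user actually wants is still the open question above.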
You mean:
P1Y = foo => bar
P2Y = bar
?
I can't remember if I solved this as part of the parentless xtrigger spawning type (#5738). If it's ready to run, it should ideally be spawned (unless user-directed otherwise)?