cylc-flow
`cylc remove` and/or `cylc forget`?
The `cylc remove` command currently:
- (only) removes task proxies from the `n=0` pool
- does not cause spawning downstream of the removed task
- use cases:
- remove incomplete tasks that we don't care about
- remove active-waiting tasks that we don't want to run when (e.g.) their xtrigger gets satisfied
- (or if we know their xtrigger will never be satisfied)
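The behaviour listed above can be illustrated with a toy model. This is only a sketch, not cylc-flow's implementation: `ToyPool` and its methods are hypothetical names, and the graph is a plain dict. It shows the one distinction that matters here: a task that completes leaves n=0 and spawns its children, whereas a removed task leaves n=0 and spawns nothing.

```python
class ToyPool:
    """Toy n=0 pool (hypothetical; NOT cylc-flow's real task pool)."""

    def __init__(self, graph):
        # graph maps a task name to the names of its downstream children
        self.graph = graph
        self.pool = set()

    def spawn(self, name):
        self.pool.add(name)

    def complete(self, name):
        # Normal completion: the task leaves the pool and its children spawn.
        self.pool.discard(name)
        for child in self.graph.get(name, []):
            self.spawn(child)

    def remove(self, name):
        # Removal: the task leaves the pool and nothing spawns downstream.
        self.pool.discard(name)


pool = ToyPool({"a": ["b"], "b": ["c"]})
pool.spawn("a")
pool.complete("a")  # "b" is spawned
pool.remove("b")    # "c" is never spawned
print(sorted(pool.pool))
```

After the removal the toy pool is empty, so the flow has nowhere to continue from unless something is triggered manually.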
From @oliver-sanders (end of https://github.com/cylc/cylc-flow/issues/4686#issuecomment-1054260136 ):
- should we extend `cylc remove` to `n>0` to make the scheduler forget that a task did run in a particular flow?
  - use case: back out of future flow merge consequences of manual task triggering
- if so, is this use case sufficiently different to require a new command, e.g. `cylc forget`?
- can we combine `cylc remove` and `cylc set-outputs` (or whatever that becomes after #4727)?
  - (probably not...)
I guess the n>0 use cases are the opposite of set-outputs i.e. unset-outputs.
Could do something like `--unset`; it might make more sense to handle it there since it also needs `--flow`.
Good point. `cylc outputs [--set] [--unset]`?
> `cylc outputs [--set] [--unset]`

Not sure about this. I think the vast majority of cases will be setting outputs (the use cases for unset are much more obscure) so I wouldn't want to have to specify `--set`.
Personally I don't mind if we stick with `cylc set-outputs` + `cylc unset-outputs` or `cylc set-outputs --unset`.
Actually, I think a separate command might be better since unsetting outputs is quite different in my mind:
- By default, apply to all active flows. Use `--flow` to specify a specific flow (do not support `--flow=none` or `--flow=new`).
- By default, remove all the outputs for a task, i.e. make the scheduler forget that a task ran in a particular flow. Use `--output` to specify a specific output (or outputs) to remove.
  - We need to think about what to allow - does anything other than custom outputs make sense?
- This command is only intended to affect the database. It makes no sense when applied to an n=0 task - it should be ignored in this case (?).
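To make the proposed semantics concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the function name `unset_outputs`, its parameters, and the dict standing in for the database are hypothetical, and no such command or API exists in cylc-flow. It follows the three points above: default to all active flows, default to all outputs, and ignore tasks that are in n=0.

```python
def unset_outputs(db, task, active_flows, flow=None, outputs=None,
                  n0_pool=frozenset()):
    """Forget recorded outputs of `task` (hypothetical semantics).

    db maps (task, flow_number) -> set of completed output names.
    """
    if flow in ("none", "new"):
        raise ValueError("--flow=none and --flow=new are not supported")
    if task in n0_pool:
        # Only the database is affected; n=0 tasks are ignored.
        return db
    # Default: every active flow. Otherwise: the one flow requested.
    flows = set(active_flows) if flow is None else {int(flow)}
    for flow_num in flows:
        key = (task, flow_num)
        if key not in db:
            continue
        if outputs is None:
            # Default: forget that the task ran in this flow at all.
            del db[key]
        else:
            # Forget only the named outputs.
            db[key] -= set(outputs)
    return db


db = {("foo", 1): {"succeeded", "x"}, ("foo", 2): {"x", "y"}}
unset_outputs(db, "foo", active_flows={1, 2}, flow=2, outputs=["x"])
print(db)  # only output "x" of flow 2 has been forgotten
unset_outputs(db, "foo", active_flows={1, 2})
print(db)  # every record of "foo" in flows 1 and 2 is gone
```

The open question of which outputs may be unset (custom outputs only, or others too) is not modelled; the sketch accepts any output name.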
`cylc stop` currently supports `--flow=INT`, which means "Stop flow number INT from spawning more tasks".
I don't really like this for 2 reasons:
1. If you `stop` a workflow you can restart it using `play`. This isn't true if you "stop" a flow.
2. If you want to stop a flow I think it is more likely that you want to remove any tasks that belong to that flow from n=0 rather than simply preventing the spawning of more tasks.
The use case I am thinking of is that I want to go back and rerun parts of a flow because, for example, system problems mean I've lost some of the data generated by the flow, or perhaps there has been some data corruption or an error in the configuration of one of the tasks. Therefore, what I want to do is start a new flow and then get rid of the old flows, which means removing any tasks from the previous flows from n=0 (which are likely to be either held or failed in this scenario).
We could perhaps add a `--flow` option to `cylc remove`. However, I don't think users should have to worry about flow numbers (as far as possible). In this case, as a user I probably want to be able to just issue the `cylc remove` command against one of the tasks from the old flow but add an option which means "remove all other tasks belonging to the same flow(s)" - `--all-flow-members`?
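One possible reading of "remove all other tasks belonging to the same flow(s)" is sketched below. The function name, the dict standing in for the n=0 pool, and the rule "shares at least one flow number with the named task" are all assumptions made for illustration; `--all-flow-members` is only a suggested option name and does not exist.

```python
def remove_all_flow_members(pool, task):
    """Return a copy of the pool without `task` or its flow-mates.

    pool maps task ID -> set of flow numbers (a toy stand-in for n=0).
    A task counts as a flow-mate if it shares any flow number with `task`.
    """
    target_flows = pool[task]
    return {
        name: flows
        for name, flows in pool.items()
        if not (flows & target_flows)
    }


pool = {"1/a": {1}, "1/b": {1}, "2/a": {2}, "2/b": {1, 2}}
print(remove_all_flow_members(pool, "1/a"))
```

Note that `2/b` is dropped here even though it also belongs to flow 2. Whether a task carrying other flow numbers should be removed, or merely lose the shared flow number, is left open by this sketch.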
> 1. If you stop a workflow you can restart it using play. This isn't true if you "stop" a flow.
> 2. If you want to stop a flow I think it is more likely that you want to remove any tasks that belong to that flow from n=0 rather than simply preventing the spawning of more tasks.

A quick response on this point (I have not thought much beyond this, so far).
I don't think (1) is necessarily a problem. A workflow consists (potentially) of multiple flows. If you stop all flows, the scheduler shuts down because there's nothing left for it to do (and a restart won't do anything). If you "stop the workflow", aka "stop the scheduler", the scheduler shuts down without stopping any flows, and the workflow (with all of its flows) can be restarted again.
In the middle ground, it seems pretty natural to me to "stop" a flow, which will not cause the scheduler to shut down if it still has other flows to manage.
However, you raise an interesting point... which I take to mean we should consider allowing `cylc play --flow=N` to restart a stopped flow in a running workflow! (maybe I regret stopping that flow...)
On 2., we really need both. Removing a flow number from all n=0 tasks to prevent that flow from continuing is quite nice, conceptually, and it would work perfectly well if we had a 100% spawn-on-demand scheduler. Unfortunately we don't have that, hence active-waiting tasks.
So we need to: a) remove the to-be-stopped flow number from active tasks (but not remove those tasks from n=0, as they're already active), and b) remove active-waiting tasks from n=0, if they have that flow number.
> However, you raise an interesting point... which I take to mean we should consider allowing `cylc play --flow=N` to restart a stopped flow in a running workflow! (maybe I regret stopping that flow...)

That sounds really tricky to me - I'm not convinced we should support it (which is, perhaps, why I still question supporting flows in `cylc stop`).
`cylc play` has a similar dual purpose: to start a scheduler and to un-pause a running one.
Another thing that perhaps argues against using `cylc remove` to stop flows, unless we can interpret it as "removing flow numbers": If I want to stop flow 2, say, I have to remove flow number 2 from active tasks and active-waiting tasks. I can't remove the active tasks, because they're already submitted or running (but removing the flow number stops them from continuing flow 2). But also, I can't remove the active-waiting tasks if they still have other (non-2) flow numbers, or it would stop those flows as well.
This is somewhat off-topic for this issue, I'll open another one... #4741
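The flow-stopping rule discussed in the comments above can be sketched as follows. This is a toy model under stated assumptions, not scheduler code: `stop_flow` and the dict layout are hypothetical, "active" means submitted or running, and anything else in the pool is treated as active-waiting.

```python
def stop_flow(pool, flow_num):
    """Strip `flow_num` from a toy n=0 pool and return the new pool.

    pool maps task ID -> {"flows": set of ints, "active": bool}.
    """
    result = {}
    for name, task in pool.items():
        if flow_num not in task["flows"]:
            # Task is not part of the stopped flow: leave it untouched.
            result[name] = task
            continue
        remaining = task["flows"] - {flow_num}
        if task["active"] or remaining:
            # Active tasks stay in the pool but can no longer continue the
            # stopped flow; active-waiting tasks stay only if they still
            # belong to some other flow.
            result[name] = {**task, "flows": remaining}
        # Otherwise: an active-waiting task with no flows left is removed.
    return result


pool = {
    "1/a": {"flows": {2}, "active": True},      # running in flow 2
    "1/b": {"flows": {2}, "active": False},     # waiting, flow 2 only
    "1/c": {"flows": {1, 2}, "active": False},  # waiting, merged flows
}
print(stop_flow(pool, 2))
```

In this example `1/a` keeps running with an empty flow set, `1/b` is removed, and `1/c` survives carrying only flow 1.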
From project meeting:
- `remove` is not a good command name in Cylc 8, it requires users to understand the scheduler task pool concept (the content of which is no longer obvious in the GUI)
- forced expiry seems better: use `cylc expire` (?) to signal that the scheduler can forget about an incomplete task instead of waiting for it to be retriggered or whatever (it will be forgotten by means of removal from the pool, but the user doesn't need to know that). Evidently Cylc 7 state reset (to expired) is sometimes used in this way.
  - we probably need to allow forced expiry of future tasks too (with no task proxy yet). c.f. future tasks to hold
See https://cylc.github.io/cylc-admin/proposal-cylc-set.html
Superseded by https://github.com/cylc/cylc-flow/issues/5643