Make `cylc remove` flow-aware and extend to historical tasks
Partially addresses #5643
Summary
This mostly implements the "Cylc Remove Extension" proposal.
Flow numbers
cylc remove now has a --flow option for removing a task from specific flows.
If not used, it will remove the task from all flows that it belongs to.
If the removed task is active/waiting, if it is removed from a subset of flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).
If a task is removed from all flows that it belongs to, it will become a no-flow task (flow=None).
For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums [^c].
[^c]: Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.
Historical tasks
cylc remove now can remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the task_states and task_outputs tables)[^a], and un-setting any prerequisites of active tasks that the removed task had naturally satisfied[^b]. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.
The above also applies to active/waiting tasks that cylc remove is used on.
[^a]: If removing flows would result in two rows in the DB no longer being unique, the SQLite UPDATE OR REPLACE statement is used, so the first entry will be removed and the most recent entry will remain.
[^b]: Prerequisites manually satisfied by cylc set --pre are not affected by cylc remove.
What's left to do
-
When removing an active task from all its flows, kill the task.
-
Should probably add a functional test with the
--flowoption. -
Need to check this one:
If removing a task causes all of the prerequisites on a downstream task to be unset (i.e. if the downstream task was spawned as a result of outputs from this task alone) then the downstream task shall be removed from the pool.
Check List
- [x] I have read
CONTRIBUTING.mdand added my name as a Code Contributor. - [x] Contains logically grouped changes (else tidy your branch by rebase).
- [x] Does not contain off-topic changes (use other PRs for other changes).
- [x] No dependency changes
- [x] Tests are included
- [x] Changelog entry included if this is a change that can affect users
- [ ] Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
- [x] If this is a bug fix, PR should be raised against the relevant
?.?.xbranch.
Code looks good, trying out some manual testing.
I'm not having much luck removing n=-1 tasks (i.e. tasks which have just succeeded). Here's an example:
[scheduling]
cycling mode = integer
initial cycle point = 1
[[graph]]
P1 = a[-P1] => a
[runtime]
[[root]]
[[a]]
script = sleep 1
completion = (submitted and succeeded)
# start the workflow
cylc vip --pause
# run task 1/a
cylc play
cylc pause
# remove task 1/a
cylc remove 1/a
Results:
- The DB is updated correctly.
- However the prereq
1/a -> 2/ais marked as satisfied in thecylc showoutput. - When I resume the workflow
2/aruns (note it was in the queue when1/awas removed).
(Test failure is just flaky tui test, not bothering re-running a 3rd time 🤮)
Have been trying this out for sub-graph re-run use cases. The remove functionality is all working correctly :+1:, I am able to re-run sub-graphs cleanly without using new flows :rocket:.
I have encountered some hitches (not related to this PR, all for consideration in follow-on work):
- I've been hitting this issue a lot (https://github.com/cylc/cylc-flow/issues/6143) with any dependencies which span more than one "rank" in the graph. In these situations, the GUI and DB say one thing, but the task pool another which is highly confusing. Restarting (or possibly reloading?) the workflow should fix it, but we need a solution to this before remove really becomes a viable solution to re-running sub-graphs. Note this is a very similar limitation to the need to satisfy off-flow prereqs when re-running via new flows.
- Removing a task doesn't necessarily change anything in the GUI (i.e. there is no visual feedback). This was anticipated when the proposal was written and is why https://github.com/cylc/cylc-ui/issues/470 is on the Cylc roadmap. I wonder whether a global change to the appearance of
n=0tasks (e.g. making them translucent as originally proposed) would provide an easier workaround and also help with other things? - Removed tasks retain their state in the GUI. E.G. if a task succeeded and was removed, it continues to be shown with a filled circle. It makes sense to preserve the job state, however, I wonder if we should reset the task state back to waiting when we remove the task to make it clearer?
What's the status on these TODO items from the OP:
What's left to do
When removing an active task from all its flows, kill the task.
Should probably add a functional test with the --flow option.
Need to check this one:
Will either come in a follow-up PR or this one depending on how soon @hjoliver reviews this
Reading:
This means prerequisites which have been satisfied by cylc trigger or cylc set will not be unset as the result of cylc remove as these have been manually satisfied by separate interventions.
I tried
[scheduler]
allow implicit tasks = True
[scheduling]
[[graph]]
R1 = foo => bar => baz => qux
$ cylc vip . --pause -n remove/one --pause
$ cylc trigger remove/one//1/qux/
Pre-requisite table in DB
| cycle | name | flow_nums | prereq_name | prereq_cycle | prereq_output | satisfied |
|---|---|---|---|---|---|---|
| 1 | qux | [1] | baz | 1 | succeeded | 0 |
$ cylc remove remove/one//1/qux
Then removes the DB entry. My reading of the spec is that this shouldn't be happening.
This means prerequisites which have been satisfied by cylc trigger or cylc set will not be unset as the result of cylc remove as these have been manually satisfied by separate interventions.
This statement is in relation to prerequisites satisfied by the task being removed, not the prerequisites of the removed task itself.
If removing a task causes all of the prerequisites on a downstream task to be unset (i.e. if the downstream task was spawned as a result of outputs from this task alone) then the downstream task shall be removed from the pool.
This is proving to be tricky, as there is a difference between how prerequisites are handled for these 2 cases:
[scheduling]
[[graph]]
R1 = """
(a1 | a2) & b => c
"""
and
[scheduling]
[[graph]]
R1 = """
(a1 | a2) => c
b => c
"""
In the first case, there is only 1 prerequisite that includes both a1 | a2 and b, whereas in the second there are 2 prerequisites - 1 for a1 | a2 and 1 for b. In the first case, removing b would result in c being removed from the task pool even if a1 or a2 had succeeded, and I am not sure there is a way to tell this without another refactor of the Prerequisite class.
Edit: I was wrong. In any case, we should remove the child task from the pool only if it has no satisfied prerequisite tasks, regardless of whether they are in any AND/OR conditions
Where are we up to with these TODO items?
What's left to do
When removing an active task from all its flows, kill the task.
Should probably add a functional test with the --flow option.
Need to check this one:
If removing a task causes all of the prerequisites on a downstream task to be unset (i.e. if the downstream task was spawned as a result of outputs from this task alone) then the downstream task shall be removed from the pool.
Where are we up to with these TODO items?
(Edited) I've updated the OP
Getting in review mode here - sorry for the delay!
@MetRonnie - is this still in draft because of the two "what's left to do" items at the top?
Yes, because Oliver wanted to wait for the whole proposal to be addressed before merging. (FYI I'm going to close this PR and open a new one to stop GitHub collapsing recent review comments)
Superseded by #6472