cylc-flow
cylc-flow copied to clipboard
Isolate ICP and FCP graphs using the runahead limit
A bit of late night fun (sad!).
Implement "start-up and shutdown graphs" (effectively isolated from the main graph so that perpetual dependence on R1 & $ tasks can be avoided) ... by manipulating the runahead limit.
These changes close #4912
Imagine a task prep
is supposed to prepare the run directory for all tasks; and clean
is supposed to tidy up at the end after all tasks have finished:
[scheduling]
cycling mode = integer
final cycle point = 5
isolate initial cycle point = True
isolate final cycle point = True
[[graph]]
R1 = "prep => foo"
P1 = "foo"
R1/$ = "foo => clean"
[runtime]
[[prep, clean, foo]]
script = sleep 10
With all tasks having the same run length, on master (without the new config items) 1/prep
runs at the same time as 2/foo, 3/foo, 4/foo, 5/foo
, then 1/foo
runs at the same time as 5/clean
. Which is clearly not the intention, so we'd need additional ugly dependencies to make it work.
Or ... on this PR branch, 1/prep => 1/foo
runs to completion, then 2/foo, 3/foo, 4/foo
run to completion, then 5/foo => 5/clean
runs. :boom:
Requirements check-list
- [x] I have read
CONTRIBUTING.md
and added my name as a Code Contributor. - [x] Contains logically grouped changes (else tidy your branch by rebase).
- [x] Does not contain off-topic changes (use other PRs for other changes).
- [x] Applied any dependency changes to both
setup.cfg
andconda-environment.yml
.
- [x] Appropriate tests are included (unit and/or functional).
- [x] Appropriate change log entry included.
- [x] (master branch) I have opened a documentation Issue at https://github.com/cylc/cylc-doc/issues/511
Assigning you as first reviewer @oliver-sanders, since it is was your idea to get "start-up graphs" cheaply and safely by manipulating the runahead limit (and a fine idea it was).
[[graph]]
R1 = "prep => foo"
P1 = "foo"
R1/$ = "foo => clean"
Not sure I like this. One of the key features of Cylc is the support for parallel cycles. In this example, if foo
is a very long task (or series of tasks) and you want to run lots of cycles in parallel, it's going to be very frustrating if you have to spend a long waiting for the first cycle to complete before the workflow can start running cycles in parallel as intended.
One way around this is to make the start and end tasks run on separate cycles:
[[graph]]
R1 = "prep"
P1!(^,$) = "foo" # or R/+P1/P1!$
R1/$ = "clean"
(syntax provided by @oliver-sanders !) However, the syntax is horrible and you also have to adjust the start and stop points.
It would be much nicer if we could do something like:
[[graph]]
start = "prep"
P1 = "foo"
finish = "clean"
I guess that's a lot more work?
@dpmatthews -
It would be much nicer if we could do something like:
Haha, that was my original suggestion, albeit explicitly restricted to start and finish graphs - you might have chimed in back on #4903!!
I guess that's a lot more work?
Well, I can think about it, if you and @oliver-sanders agree - he didn't like the separate graphs approach, which is how we ended up here. But maybe the restriction to "start" and "finish" graphs would make that acceptable - what do you think Oliver?
I think the runahead limit implementation is OK. It's opt-in, we can explain the potential downside and that prep[^] => foo
) is an alternative. And perhaps worth noting that in a workflow with many cycles, less than optimal scheduling in the first won't matter much.
However, the syntax is horrible and you also have to adjust the start and stop points.
Cool that our syntax support that though. But inter-cycle triggers would make that even worse (the offset would break pre-initial ignore).
Long story short, I also think this is better:
[[graph]]
start = "prep"
P1 = "foo"
finish = "clean"
If you look at the description of #4903 that's what I was aiming at, but maybe I shot myself in the foot there by trying to generalize it!
you might have chimed in back on https://github.com/cylc/cylc-flow/issues/4903!!
Sorry!
But inter-cycle triggers would make that even worse (the offset would break pre-initial ignore).
Yuck - hadn't thought of that.
@oliver-sanders agree - he didn't like the separate graphs approach
I'm ok with separate recurrences for the ICP/FCP. I think a fully abstracted multi-graph solution would require a lot more thought than we can reasonably provide at this point in time.
Long term it might also be the case that workflow modules / composition (e.g. the advanced handling of sub-workflows) may provide a simpler solution to this sort of problem.
Long story short, I also think this is better:
[[graph]] start = "prep" P1 = "foo" finish = "clean"
It's definitely better, the issue is trying to work out how to implement it! The "start" and "finish" cycles need to be assigned a "cycle point" but what should that be? We could do something like subtract PT1S
from the ICP at configuration time, but that's a bit icky (and won't necessarily work with cycle point format
). Or we could implement start
and finish
as special cycle points, but that requires patching all cycling interfaces to handle these special keywords.
Or we could implement start and finish as special cycle points, but that requires patching all cycling interfaces to handle these special keywords.
I was thinking along those lines, but I haven't thought through it in depth yet.
Superseded by #5090