cylc-flow
Pre sequence-start ignore
For cycling sequences that start after the suite initial point, I think we should have pre-sequence-start ignore (cf. pre-start ignore with respect to the suite initial cycle point).
Consider this suite, which does not rely on pre-start ignore:
[[dependencies]]
[[[R1]]]
graph = foo
[[[R/+P1Y/P1Y]]]
graph = foo[-P1Y] => foo
[[[R1/+P2Y/]]]
graph = bar
[[[R/+P3Y/P1Y]]]
graph = bar[-P1Y] => bar
Thanks to pre-start ignore we can simplify foo's graph to this:
[[[R//P1Y]]]
graph = foo[-P1Y] => foo
IMO we should be able to simplify bar's graph in the same way, reducing the example suite to this:
[[[R//P1Y]]]
graph = foo[-P1Y] => foo
[[[R/+P2Y/P1Y]]]
graph = bar[-P1Y] => bar
This would significantly reduce the complexity of real suites with delayed-start sequences. Pre-sequence-start ignore could not be universal though - you might legitimately want bar to trigger off an earlier foo. So we could do something like this to mark that a dependence should be ignored at sequence start-up:
[[[R/+P2Y/P1Y]]]
graph = """bar[-P1Y]* => bar # ignore bar[-P1Y] at sequence start
foo[-P1Y] => bar""" # don't ignore foo[-P1Y] at sequence start
@arjclark - thoughts? Could the existing pre-start ignore mechanism be reused for this, or does the explicit marking of ignorable dependencies actually make it easier?
@hjoliver - in theory, any explicit marking you do should make things easier, as the code is just stripping out pre-reqs that get marked for removal. I guess you'd need to make sure the filtering is applied only to the first item in the sequence rather than propagated through the whole sequence. I think that's where any implementation nastiness will be, but it should be doable.
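To make the "strip marked pre-reqs, but only for the first item in the sequence" idea concrete, here is a minimal sketch. It is not Cylc code: integer years stand in for cycle points, and `parse_graph`, `prerequisites` and the `sequence_start` argument are hypothetical names invented for illustration. It only understands lines of the form `up[-PnY] => down`, with the proposed trailing `*` marker.

```python
import re

# Hypothetical model, not Cylc code: integer "years" stand in for cycle points.
# Matches one "upstream[-PnY] => downstream" line, with an optional trailing
# "*" on the upstream marking it as ignorable at sequence start.
EDGE = re.compile(r"^\s*(\w+)\[-P(\d+)Y\](\*?)\s*=>\s*(\w+)\s*$")


def parse_graph(graph):
    """Return a list of (upstream, offset_years, ignorable, downstream)."""
    edges = []
    for line in graph.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        if not line.strip():
            continue
        match = EDGE.match(line)
        if match is None:
            raise ValueError(f"unsupported graph line: {line!r}")
        up, years, star, down = match.groups()
        edges.append((up, int(years), star == "*", down))
    return edges


def prerequisites(edges, task, point, sequence_start):
    """Prerequisites of task at point, as (upstream, upstream_point) pairs.

    Marked edges are dropped only when point is the sequence start, so the
    filtering does not propagate to later cycles.
    """
    return [
        (up, point - years)
        for up, years, ignorable, down in edges
        if down == task and not (ignorable and point == sequence_start)
    ]


GRAPH = """
bar[-P1Y]* => bar  # ignore bar[-P1Y] at sequence start
foo[-P1Y] => bar   # don't ignore foo[-P1Y] at sequence start
"""
edges = parse_graph(GRAPH)
first = prerequisites(edges, "bar", 2, sequence_start=2)
later = prerequisites(edges, "bar", 3, sequence_start=2)
print(first)  # [('foo', 1)]
print(later)  # [('bar', 2), ('foo', 2)]
```

With bar's sequence starting at year 2, the marked `bar[-P1Y]` pre-req is dropped for bar at year 2 only; bar at year 3 keeps both pre-reqs.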
Since the spawn-on-demand implementation, a workflow that only has dependencies like this will shut down immediately on startup without doing anything (Cylc 8), rather than start up and immediately stall (Cylc 7). #4638
This seems to have risen in importance again, because of the arguably-correct but also arguably-unhelpful "premature shutdown" effect in Cylc 8.
So if we do this, note this point from my original description:
Pre-sequence-start ignore could not be universal though - you might legitimately want bar to trigger off an earlier foo.
i.e., pre-initial-ignore would be fine here:
[[[R//P1Y]]]
graph = foo[-P1Y] => foo
[[[R/+P2Y/P1Y]]]
graph = bar[-P1Y] => bar
but (unlike for the ICP) not here:
[[[R//P1Y]]]
graph = foo[-P1Y] => foo
[[[R/+P2Y/P1Y]]]
graph = foo[-P1Y] => bar
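The difference between those two cases can be sketched as a check on whether the referenced upstream instance exists at all. This is a hypothetical model rather than anything in the Cylc codebase: points are whole years after the ICP, every sequence is assumed to be yearly, and `has_instance` and `safe_to_ignore` are made-up names.

```python
# Hypothetical model, not Cylc code: each task runs yearly from a start
# offset (in whole years after the initial cycle point).
def has_instance(start, point):
    """True if a yearly sequence beginning at `start` includes `point`."""
    return point >= start


def safe_to_ignore(upstream_start, downstream_start, offset_years):
    """True if, at the downstream sequence start, the upstream reference
    points at a cycle where the upstream task never runs."""
    referenced = downstream_start - offset_years
    return not has_instance(upstream_start, referenced)


FOO_START, BAR_START = 0, 2  # R//P1Y and R/+P2Y/P1Y

# bar[-P1Y] => bar: bar does not exist at year 1, so nothing can satisfy it.
bar_on_bar = safe_to_ignore(BAR_START, BAR_START, 1)
# foo[-P1Y] => bar: foo does run at year 1, so this is a real dependency.
bar_on_foo = safe_to_ignore(FOO_START, BAR_START, 1)
print(bar_on_bar, bar_on_foo)  # True False
```

Under these assumptions the first example's dependency could be dropped at sequence start, while the second one's could not, which is why a blanket rule would not work here.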
I think for the cases you've shown the original logic still stands whenever something references a task before the initial cycle point, because there's nothing you can infer about anything before the ICP. There are a couple of missing components/considerations in the spawn-on-demand world though, including ones we never really reached consensus on:
- In many cases the "what do I do" is best answered by an initial cycle point definition of tasks, so a suite written only in the form of your comment could get in trouble, whereas one that is explicit about the first cycle avoids the limbo of uncertainty we've tried to handle with pre-initial-cycling.
- I have no idea what the correct behaviour for handling "?"-type options should be for pre-initial filtering. Arguably a task like that with only pre-initial-cycle dependencies should never be spawned at all.
- It is unclear how to handle those sorts of dependencies when you kick off a flow partway through a suite's defined period (i.e. at a point past the ICP, possibly in a running suite; cylc-reflow?). As per the earlier comment though (2016!), you should only really be stripping dependencies from the first instance of a task in the definition of the workflow, as opposed to the first instance created. That is a subtlety that now exists in the spawn-on-demand world.
- I think there's also some subtlety in the case of sections like these:
[[[R/+P2Y/P1Y]]]
which nominally might imply a task should exist within the scope of the workflow's defined cycle period that hasn't been created elsewhere. For the example you give above I think both are technically valid for all cycle points where all elements of the dependencies reference cycle points inside the suite definition. It does get gnarly though if you have, say, foo[-P3Y] => bar, as in that situation the ICP filtering needs applying not only to the first instance of bar but to all instances of bar where foo[-P3Y] would reference a point outside the workflow entirely. I honestly don't know what the codebase does for that these days, but it is the most likely point of a gotcha.
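As a rough illustration of which instances are affected by foo[-P3Y] => bar, the sketch below lists the downstream cycle points whose upstream reference falls before the ICP. It is a toy model only: points are integer years with the ICP at 0, the sequence is yearly, and `pre_initial_points` and `final_point` are hypothetical names, not Cylc functions or settings.

```python
def pre_initial_points(downstream_start, offset_years, final_point, icp=0):
    """Downstream cycle points whose upstream reference precedes the ICP.

    Hypothetical helper: points are integer years and the downstream task
    is assumed to run every year from downstream_start to final_point.
    """
    return [
        point
        for point in range(downstream_start, final_point + 1)
        if point - offset_years < icp
    ]


# foo[-P3Y] => bar with bar on R/+P2Y/P1Y: bar at year 2 references foo at -1.
delayed = pre_initial_points(2, 3, 6)
# Same offset with bar starting at the ICP (an assumed variant, R//P1Y).
from_icp = pre_initial_points(0, 3, 6)
print(delayed)   # [2]
print(from_icp)  # [0, 1, 2]
```

In this model the number of affected instances depends on how far the offset reaches back relative to the sequence start, so filtering only the first instance of bar would not be enough in the second case.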