
sub-optimal spawning of tasks that lose their parents

Open hjoliver opened this issue 4 years ago • 0 comments

Parent-less tasks have no parents to spawn them "on demand", so we have to auto-spawn them:

P1 = "foo"  # foo must be auto-spawned

To stay within the runahead limit and avoid an infinite-spawning apocalypse, we do this by spawning a parent-less task when its previous instance drops out of the runahead pool into the main pool. (Above, when foo.3 leaves the runahead pool, foo.4 is spawned into the runahead pool, and so on.)
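As a rough illustration of that chaining, here is a toy sketch. It is not the cylc-flow implementation: the two pools are modelled as plain lists of integer cycle points, and `release_and_spawn` is a hypothetical name.

```python
def release_and_spawn(runahead_pool, main_pool, point):
    """Release foo.<point> to the main pool and auto-spawn its successor.

    Toy model only: pools are plain lists of integer cycle points, and the
    successor is assumed to be point + 1 (integer cycling, P1).
    """
    runahead_pool.remove(point)   # foo.<point> drops out of the runahead pool...
    main_pool.append(point)       # ...into the main pool
    successor = point + 1
    runahead_pool.append(successor)  # the next instance is spawned only now
    return successor


# foo.1 and foo.2 are already in the main pool; foo.3 is held back.
runahead_pool = [3]
main_pool = [1, 2]
spawned = release_and_spawn(runahead_pool, main_pool, 3)
```

After the call, `spawned` is 4, `runahead_pool` is `[4]` and `main_pool` is `[1, 2, 3]`. Each instance only ever creates one successor, which is what keeps the spawning bounded.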

Note this does not impose any artificial sequential behavior on the execution of foo - so that's all fine. However, things are less optimal if foo initially has parents and then loses them:

R2/^/P1 = "dad => foo"
P1 = "foo"

Ideally (constrained only by the runahead limit) foo.3 should be able to start concurrently with dad.1 here, to run before foo.1 and foo.2 which have to wait on their parents. But with the implementation described above, foo.3 won't be spawned until foo.2 drops out of the runahead pool, and foo.2 does not exist until it is spawned when dad.2 succeeds.

So this is non-optimal in two ways:

  • foo.3 is unnecessarily delayed until after dad.2 has spawned foo.2
  • foo.3 would be unnecessarily prevented from existing if dad.2 failed and thus did not spawn foo.2 at all
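Both points can be seen in a small toy model of the chained behaviour. This is only a sketch under simplifying assumptions: integer cycling from point 1, every spawned foo instance eventually leaves the runahead pool, and `spawned_foo_points` and `dad_succeeded` are hypothetical names, not cylc-flow code.

```python
def spawned_foo_points(dad_succeeded, last_point):
    """Return the foo cycle points that ever get spawned with chained auto-spawning.

    dad_succeeded maps each cycle point where "dad => foo" applies to whether
    dad succeeded there. Points missing from the mapping are parent-less.
    """
    spawned = []
    for point in range(1, last_point + 1):
        if point in dad_succeeded:
            # foo has a parent here: spawned on demand only if dad succeeds.
            if dad_succeeded[point]:
                spawned.append(point)
        elif point - 1 in spawned:
            # Parent-less: spawned only once the previous instance exists
            # and has been released from the runahead pool.
            spawned.append(point)
    return spawned


all_ok = spawned_foo_points({1: True, 2: True}, last_point=4)
dad2_failed = spawned_foo_points({1: True, 2: False}, last_point=4)
```

Here `all_ok` is `[1, 2, 3, 4]`, but `dad2_failed` is just `[1]`: with foo.2 never spawned, nothing is left to trigger foo.3 or anything after it.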

Probably not hard to fix - the auto-spawn mechanism, including at start-up, just needs to use the available sequence info better.
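One possible shape for this, as a sketch only: work out from the sequences which cycle points of foo have no parents, and auto-spawn those directly instead of chaining off the previous instance. The helper name `parentless_points` and the representation of sequences as a set of integer points are assumptions for illustration, not cylc-flow's API.

```python
def parentless_points(parented_points, first_point, last_point):
    """Return the cycle points in [first_point, last_point] where foo has no parents.

    parented_points is the set of points covered by a sequence in which foo
    has a parent, e.g. {1, 2} for R2/^/P1 = "dad => foo" starting at point 1.
    last_point stands in for the upper edge of the runahead window.
    """
    return [
        point
        for point in range(first_point, last_point + 1)
        if point not in parented_points
    ]


# R2/^/P1 = "dad => foo" covers points 1 and 2; P1 = "foo" covers everything.
candidates = parentless_points({1, 2}, first_point=1, last_point=4)
```

With a window of points 1 to 4, `candidates` is `[3, 4]`, so foo.3 could be spawned at start-up without waiting on foo.2 or dad.2.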

hjoliver, Aug 06 '20 05:08