Flow-specific hold/release

Open hjoliver opened this issue 1 year ago • 7 comments

Close #4277

Hold/release tasks in a specific flow (or any flow, by default)

  • [x] Implement cylc hold --flow=n
  • [x] Same for cylc release --flow=n

If --flow is not used, matching tasks will be held or released regardless of flow number.

  • (most of the time there will only be one flow)

If --flow=n is specified, matching tasks will be held or released only if their flow numbers include n.

  • (If a merged task foo has flow numbers {2,3}, it belongs to both flow 2 and flow 3, so asking to hold foo in flow 2 or in flow 3 should hold it. Ditto for release.)
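
The matching rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the code in this branch; the function name `matches_flow` and its arguments are hypothetical.

```python
from typing import Optional, Set


def matches_flow(task_flow_nums: Set[int], flow: Optional[int] = None) -> bool:
    """Return True if a hold/release request applies to a task.

    Hypothetical helper: `task_flow_nums` is the set of flow numbers the
    task belongs to; `flow` is the value given with --flow, or None if
    the option was not used.
    """
    if flow is None:
        # No --flow given: match regardless of flow number.
        return True
    # --flow=n given: match only if the task's flow numbers include n.
    return flow in task_flow_nums
```

With this rule a merged task with flow numbers {2, 3} matches --flow=2 and --flow=3, but not --flow=1.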

Check List

  • [x] I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • [x] Contains logically grouped changes (else tidy your branch by rebase).
  • [x] Does not contain off-topic changes (use other PRs for other changes).
  • [x] Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • [x] Tests are included (or explain why tests are not needed).
  • [x] CHANGES.md entry included if this is a change that can affect users
  • [x] Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/646
  • [x] If this is a bug fix, PR should be raised against the relevant ?.?.x branch. (not a bug fix)

hjoliver avatar Aug 17 '23 02:08 hjoliver

Test case:

[scheduling]
    [[graph]]
        R1 = "a => b => c => d"
[runtime]
    [[a,b,c,d]]
        pre-script = "sleep 2"
    [[a]]
        script = """
            if (( CYLC_TASK_SUBMIT_NUMBER == 1)); then
                cylc hold --flow=2 ${CYLC_WORKFLOW_ID}//1/b
            fi
        """
    [[c]]
        script = """
            if (( CYLC_TASK_SUBMIT_NUMBER == 1)); then
                cylc trigger --flow=new ${CYLC_WORKFLOW_ID}//1/a
            fi
        """
  • at start-up, a puts a future hold on b in flow 2
  • flow 1 runs b and then c; c triggers a in a new flow, which starts flow 2
  • flow 2 gets held at b

hjoliver avatar Aug 17 '23 02:08 hjoliver

One thing to note, on "hold after cycle point" functionality:

This branch makes it flow-specific, in that you can start a new flow and tell just that flow to hold after a given point:

cylc trigger --flow=new ...
cylc hold --after=4 --flow=2

However, there is still only one global hold-after point, whether it is for all tasks or for a single flow, so canceling the hold point doesn't need to be flow-specific:

cylc release --all ...
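
As a rough mental model of the "one global hold-after point" behaviour described above, here is a toy Python sketch. It assumes integer cycle points and a single slot that optionally records one flow number; the class and method names are hypothetical and this is not how the scheduler stores it.

```python
from typing import Optional, Set, Tuple


class ToyHoldAfter:
    """Toy model: one global hold-after point, optionally tied to one flow."""

    def __init__(self) -> None:
        # (hold point, flow number or None for "any flow"); None means unset.
        self._slot: Optional[Tuple[int, Optional[int]]] = None

    def set_point(self, point: int, flow: Optional[int] = None) -> None:
        # Setting a new hold point replaces the previous one, whatever its flow.
        self._slot = (point, flow)

    def clear(self) -> None:
        # There is only one slot, so clearing needs no flow argument.
        self._slot = None

    def applies_to(self, task_point: int, task_flow_nums: Set[int]) -> bool:
        if self._slot is None:
            return False
        point, flow = self._slot
        if task_point <= point:
            # Only tasks after the hold point are affected.
            return False
        return flow is None or flow in task_flow_nums
```

Because there is a single slot, `clear()` takes no flow argument, which mirrors why canceling the hold point does not need to be flow-specific.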

In future we should probably allow multiple flow-specific hold points, but I don't think that's a high priority. This branch is still a significant enhancement as-is. I expect most demand will be for single-flow re-runs that we don't want to carry on indefinitely (so stop or hold at a future point).

hjoliver avatar Sep 01 '23 23:09 hjoliver

This bad boy is ready for review.

hjoliver avatar Sep 01 '23 23:09 hjoliver

Test coverage is good without any functional tests, although that makes me a bit nervous for this sort of thing. Any opinions on that?

hjoliver avatar Sep 07 '23 00:09 hjoliver

With the example, does the 2nd flow get merged into the first flow?

It ends after release:

(flow) sutherlander@cortex:cylc-flow$ cylc release --flow=2 fhold/run1//1/b
Done

(confirmed working)

(flow) sutherlander@cortex:cylc-flow$ cylc release --flow=1 fhold/run1//1/b
Done
(flow) sutherlander@cortex:cylc-flow$ cylc release --flow=3 fhold/run1//1/b
Done

(confirmed to have no effect, as expected 👍), with flow 2 coming out the other end:

{
  "data": {
    "workflows": [
      {
        "id": "~sutherlander/fhold/run1",
        "taskProxies": [
          {
            "id": "~sutherlander/fhold/run1//1/c",
            "state": "succeeded",
            "flowNums": "[1]",
            "isHeld": false
          },
          {
            "id": "~sutherlander/fhold/run1//1/d",
            "state": "running",
            "flowNums": "[2]",
            "isHeld": false
          }
        ],
        "familyProxies": [
          {
            "id": "~sutherlander/fhold/run1//1/root",
            "state": "running"
          }
        ]
      }
    ]
  }
}

This is kind of beyond the scope of this PR, but I guess I'll need to look into whether the flow numbers are correct in the data-store. Perhaps no-merge is the default, but otherwise I would have expected flow numbers "[1, 2]" coming out of it.
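
For checking flow numbers in a response like the one above, note that "flowNums" arrives as a JSON-encoded string (for example "[1]"), so it needs a second decode. A minimal Python sketch, assuming the response has already been loaded into a dict of the shape shown; the function name `tasks_in_flow` is hypothetical:

```python
import json
from typing import List


def tasks_in_flow(response: dict, flow: int) -> List[str]:
    """Return IDs of task proxies whose flow numbers include `flow`."""
    ids = []
    for workflow in response["data"]["workflows"]:
        for proxy in workflow["taskProxies"]:
            # "flowNums" is itself a JSON string such as "[1]" or "[1, 2]".
            flow_nums = json.loads(proxy["flowNums"])
            if flow in flow_nums:
                ids.append(proxy["id"])
    return ids


# Trimmed copy of the response shown above.
SAMPLE = {
    "data": {
        "workflows": [
            {
                "id": "~sutherlander/fhold/run1",
                "taskProxies": [
                    {"id": "~sutherlander/fhold/run1//1/c", "flowNums": "[1]"},
                    {"id": "~sutherlander/fhold/run1//1/d", "flowNums": "[2]"},
                ],
            }
        ]
    }
}
```

Calling `tasks_in_flow(SAMPLE, 2)` picks out only 1/d, which matches the observation that flow 2 is the one coming out the other end.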

Looks good.

dwsutherland avatar Sep 07 '23 02:09 dwsutherland

With the example, does the 2nd flow get merged into the first flow?

No, merging only occurs if the same task from different flows meets in n=0. Here, flow 1 finishes before flow 2 can catch up with it:

$ cylc log exa --no-timestamp | grep "=> succeeded" | sed -e 's/^.*-//g'
 [1/a running job:01 flows:1] => succeeded
 [1/b running job:01 flows:1] => succeeded
 [1/c running job:01 flows:1] => succeeded
 [1/a running job:02 flows:2] => succeeded
 [1/d running job:01 flows:1] => succeeded  # <----- flow 1 finished
 [1/b running job:02 flows:2] => succeeded
 [1/c running job:02 flows:2] => succeeded
 [1/d running job:02 flows:2] => succeeded
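
To make the merge condition concrete, here is a toy Python model of it. It assumes the n=0 window can be treated as a simple mapping from task ID to flow numbers, and that a merge is the union of the flow numbers; the class and method names are hypothetical and this is not the task pool implementation.

```python
from typing import Dict, Set


class ToyPool:
    """Toy stand-in for the n=0 window: task ID -> flow numbers."""

    def __init__(self) -> None:
        self.active: Dict[str, Set[int]] = {}

    def spawn(self, task_id: str, flow_nums: Set[int]) -> Set[int]:
        if task_id in self.active:
            # Same task already present from another flow: merge flow numbers.
            self.active[task_id] |= flow_nums
        else:
            self.active[task_id] = set(flow_nums)
        # Return a copy so callers cannot mutate the pool's state.
        return set(self.active[task_id])

    def complete(self, task_id: str) -> None:
        # A finished task leaves the window, so later flows cannot merge with it.
        self.active.pop(task_id, None)
```

In the log above, 1/d of flow 1 has already completed by the time flow 2 reaches it, so in this model the second `spawn` of 1/d sees nothing to merge with and keeps flow numbers {2}.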

hjoliver avatar Sep 07 '23 02:09 hjoliver

[Update]: I think this is basically done; I just have to come back and respond to some review comments. It's possible I can do that in time for 8.3.0, let's see.

hjoliver avatar Feb 27 '24 04:02 hjoliver