cylc-flow icon indicating copy to clipboard operation
cylc-flow copied to clipboard

Double-reload if command in job script

Open hjoliver opened this issue 2 years ago • 4 comments

Bug introduced by this change: https://github.com/cylc/cylc-flow/pull/5592

[scheduling]
    [[graph]]
        R1 = foo
[runtime]
    [[foo]]
        script = cylc reload $CYLC_WORKFLOW_ID

This results in two calls to the reload mutation, instead of one, and repeats the task status messages as well.

$ cylc log bug | grep received
2023-12-06T14:26:19+13:00 DEBUG - [1/foo submitted job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo running job:01 flows:1] (received)succeeded
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)started
2023-12-06T14:26:20+13:00 DEBUG - [1/foo succeeded job:01 flows:1] (received)succeeded

$ cylc log bug | grep reload
2023-12-06T14:26:19+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - [command] reload_workflow
2023-12-06T14:26:20+13:00 INFO - [1/foo running job:01 flows:1] reloaded task definition
2023-12-06T14:26:20+13:00 WARNING - [1/foo running job:01 flows:1] active with pre-reload settings
2023-12-06T14:26:20+13:00 INFO - Command actioned: reload_workflow()
2023-12-06T14:26:20+13:00 INFO - [1/foo running job:01 flows:1] reloaded task definition
2023-12-06T14:26:20+13:00 WARNING - [1/foo running job:01 flows:1] active with pre-reload settings
2023-12-06T14:26:20+13:00 INFO - Command actioned: reload_workflow()

hjoliver avatar Dec 06 '23 01:12 hjoliver

This new block is the cause, but I haven't figured out exactly what's going on:

https://github.com/cylc/cylc-flow/blob/a965b918a57e89e9e79119a59cfce2686c84674e/cylc/flow/scheduler.py#L1089-L1112

hjoliver avatar Dec 06 '23 01:12 hjoliver

Ah, some progress. It's because the same job gets submitted twice! (Not double-handling of queued commands during reload as I initially thought).

It only happens if the task state hasn't updated yet, after job submission. So not a very serious bug after all, but still a bug.

Workaround:

script = """
    cylc__job__wait_cylc_message_started
    cylc reload $CYLC_WORKFLOW_ID
"""

hjoliver avatar Dec 06 '23 01:12 hjoliver

It only happens if the task state hasn't updated yet

hmm, fishy. What state was the task in at this point? preparing or submitted?

oliver-sanders avatar Dec 06 '23 09:12 oliver-sanders

I can't reproduce with your original example

INFO - [1/foo preparing job:01 flows:1] => submitted
INFO - [command] reload_workflow
INFO - PAUSING the workflow now: Reloading workflow
INFO - Reloading the workflow definition.
INFO - LOADING workflow parameters
INFO - + workflow UUID = d5194cff-9f4d-4400-975c-3b36de1023e3
INFO - + UTC mode = False
INFO - + cycle point time zone = Z
INFO - + paused = True
INFO - Reloading task definitions.
INFO - [1/foo submitted job:01 flows:1] reloaded task definition
WARNING - [1/foo submitted job:01 flows:1] active with pre-reload settings
INFO - LOADING job data
INFO - Reload completed.
INFO - RESUMING the workflow now
INFO - Command actioned: reload_workflow()
INFO - [1/foo submitted job:01 flows:1] => running
INFO - [1/foo running job:01 flows:1] => succeeded
INFO - Workflow shutting down - AUTOMATIC

MetRonnie avatar Jan 23 '24 15:01 MetRonnie