cylc-flow
cylc-flow copied to clipboard
Skip Mode
Closes https://github.com/cylc/cylc-flow/issues/5641 (also fixes #5975, and fixes #5820 )
This branch includes (marked against skip mode proposal):
- A new task run mode called "skip" to sit alongside the existing "live" and "simulation" and "dummy" modes.
- Skip mode settings in `[runtime][<namespace>][skip].
- Task-level control of the run mode - allowing run mode of tasks to override that of the workflow using
[runtime][<namespace>]run mode. a. Broadcast can changerun modefor future task job submissions. b. Cylc Validate and lint will warn about the setting not being live. cylc set --out skipsets outputs from skip mode.- Tests to ensure that
run mode = skiprespectsis_heldflag. - Tests to ensure that force triggering a task will not override the run mode.
Extras 7. Run Mode is available as an task attribute in the UI 8. When tasks are run in skip mode, the prerequisites which correspond to the outputs they generate should be marked as satisfied by skip mode rather than satisfied naturally for provenance reasons. For the purpose of cylc remove logic, satisfied by skip mode should be treated the same as satisfied naturally.
There are two extensions, which I haven't dealt with yet, because I want to ensure that the basic functionality works, and move to the substantial documentation PR which need follow this.
Check List
- [x] I have read
CONTRIBUTING.mdand added my name as a Code Contributor. - [x] Contains logically grouped changes (else tidy your branch by rebase).
- [x] Does not contain off-topic changes (use other PRs for other changes).
- [x] Applied any dependency changes to both
setup.cfg(andconda-environment.ymlif present). - [x] Tests are included (or explain why tests are not needed).
- [x]
CHANGES.mdentry included if this is a change that can affect users - [x] Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
- [x] Feature: PR against master
Got a change to the schema representation for run mode here: https://github.com/wxtim/cylc/pull/61
Add skip mode to test for cylc validate --mode=sim similar to the tests in https://github.com/cylc/cylc-flow/pull/6213
@wxtim, what's the status here?
What I've done is ready to review, but....
- [x] Need to review https://github.com/wxtim/cylc/pull/61 - I need to review
- [x] Add test in https://github.com/cylc/cylc-flow/pull/6213
- [x] check https://github.com/cylc/cylc-flow/issues/6189
- Not Done
@wxtim, according to the OP, this does not close #5641. If that's out of date, just amend the OP, if you're planning to do it in a follow-on, remove the linked issue and stick a comment on it so we know what the status is.
- Not Done
@wxtim, according to the OP, this does not close #5641. If that's out of date, just amend the OP, if you're planning to do it in a follow-on, remove the linked issue and stick a comment on it so we know what the status is.
I've done it, but in the process of checking that I've done it I spotted an undesirable failure to document behaviour which led me down a rabbit hole leading to fix https://github.com/cylc/cylc-flow/pull/6039/commits/380abccc3caf6cf7019b5912d0533bf2a5b62fa1.
Also
try_numis 0 instead of 1 to begin with
I quick check against master suggests that's not true of the db
(FYI bullet point [8] is blank in the PR description)
Have started functional testing, looking good so far, a couple of small issues:
1) Cannot broadcast [skip] settings
E.G. This command should configure a task to run in skip mode AND fail when it does so:
$ cylc broadcast -s 'run mode = skip' -s '[skip]outputs = failed'
However, the [skip] part seems to be ignored. Not a biggie from a functionality perspective (why would anyone want to do this!), but if we're allowing users to do this it should work.
2) The started output isn't being produced?
This example stalls for me:
a? & a:started => b
[runtime]
[[root]]
run mode = skip
3) Skip mode messes with the average task runtime
Skip mode tasks are not excluded from the average task runtime calculation which messes it up a fair bit.
Not a biggie, can be moved to a follow-on issue.
4) Skip mode messes with the analysis view
Skip mode tasks are not excluded from the analysis view statistics which is a tad confusing.
Not a biggie, can be punted into a small issue for the data workflows team. Just make sure that it is easy to filter out skip mode tasks in the task_jobs DB table (i.e. by filtering by platform name).
Opened: https://github.com/cylc/cylc-uiserver/issues/626
- Cannot broadcast [skip] settings
Fixed by https://github.com/cylc/cylc-flow/pull/6039/commits/f2fb60e005007efb47f6461d18ff3d88e3aa1e47. It turned out to be a bit of a rabbit hole with disable task event handlers. I can see wanting to change this as being a reason to broadcast skip mode settings.
- The started output isn't being produced?
No, but I don't think it's my change at fault - Since it appears to exist as an entirely separate bug present in Cylc as far back as 8.0.0! I'll attempt to fix anyway!
Proposed fix up at https://github.com/cylc/cylc-flow/pull/6351 - since it's a bug. Will rebase and fix after.
- Skip mode messes with the average task runtime
Fixed https://github.com/cylc/cylc-flow/pull/6039/commits/660ca3a7c56c652e38c6047adcd564925dca6e54
I have noticed that dummy.py looks like
Inspecting a workflow with "skip mode" tasks using Cylc Review revealed a couple of minor things:
- The "job #" box says "1 of 0" whereas it would say "1 of 1" for tasks run in "live" or "simulation" mode.
- This maybe suggests that something isn't being incremented or written to the DB correctly.
- The "job batch" box says "simulation" for "skip" mode tasks.
- I'm not sure why that is, it might be a hardcoded value in Cylc Review (if so no worries).
The "job #" box says "1 of 0" whereas it would say "1 of 1" for tasks run in "live" or "simulation" mode.
- This maybe suggests that something isn't being incremented or written to the DB correctly.
This appears to be the case - neither of the "nonlive" modes were updating the task_states table. Added, along with a wee test.
- The "job batch" box says "simulation" for "skip" mode tasks.
This is an easy fix, and I'll push it up.
Having a play around with this branch with some imagined use cases.
Unfortunately, I've encountered one critical traceback:
Traceback (most recent call last):
File "~/cylc/flow/scheduler.py", line 652, in run_scheduler
await self._main_loop()
File "~/cylc/flow/scheduler.py", line 1520, in _main_loop
self.release_queued_tasks()
File "~/cylc/flow/scheduler.py", line 1267, in release_queued_tasks
for itask in self.task_job_mgr.submit_task_jobs(
File "~/cylc/flow/task_job_mgr.py", line 266, in submit_task_jobs
itasks, submitted_nonlive_tasks = self.submit_nonlive_task_jobs(
File "~/cylc/flow/task_job_mgr.py", line 1080, in submit_nonlive_task_jobs
nonlive_mode = submit_func(
File "~/cylc/flow/run_modes/skip.py", line 50, in submit_task_job
task_job_mgr._set_retry_timers(itask, rtconfig)
File "~/cylc/flow/task_job_mgr.py", line 1011, in _set_retry_timers
or itask.platform['submission retry delays']
KeyError: 'submission retry delays'
I had configured the tasks to run in skip-mode and to generate the failed output using a broadcast. I then made another broadcast to change the skip-mode outputs and re-triggered the tasks.
I had configured the tasks to run in skip-mode and to generate the failed output using a broadcast. I then made another broadcast to change the skip-mode outputs and re-triggered the tasks.
I've replicated this :(
Poking @MetRonnie because Oliver Hath Approved!
(Will get round to it after the Cylc/Rose training)
Latest commit seems to include some kind of squashed merge of master into this branch?
Latest commit seems to include some kind of squashed merge of master into this branch?
Yes - I made the history horrible, and I think this is the best I can do to clean it up.
@MetRonnie & @oliver-sanders have a look at https://github.com/cylc/cylc-flow/pull/6039/commits/bc8d067b24f08770aaaf94eb0c420cd0c9e88fa5. I'm not convinced it's a good idea, but if you think it might be I'll add tests. But IMO mission creep.
https://github.com/MetRonnie/cylc-flow/tree/skip-mode-git-rescue is identical to 047c76fdb1adbb3ae7d6f65987299eecf7bfd9a3 but with a proper history (took your previous head commit and dropped the (empty) merge commit named "merge" which was possibly a case of merging a local branch into an identical but rebased branch).
To avoid this getting lost above: https://github.com/cylc/cylc-flow/pull/6039#discussion_r1822975115
Getting a warning when setting the skip output on a task (that does not have any custom outputs defined)
INFO - Command "set" received. ID=c0fa6e99-387b-409f-9755-e14596cdee40
set(flow=[], flow_wait=False, outputs=['skip'], prerequisites=[], tasks=['1/foo'])
WARNING - output 1/foo:skip not found
INFO - [1/foo:waiting(queued)] => succeeded(queued)
INFO - [1/foo/00:succeeded(queued)] => succeeded
Getting a warning when setting the
skipoutput on a task (that does not have any custom outputs defined)
Fixed in https://github.com/cylc/cylc-flow/pull/6039/commits/8136da67a96538e79653852ceb3c8c1eff7c5638
I think this is pretty much good to go 🎉 Just got a conflict to resolve + my suggestion above
I've got a deconfliction here: https://github.com/wxtim/cylc/pull/70
I've checked it carefully, should be correct.
:tada:
Congrats Tim!