data store: remove threading
Closes #194
Run subscribers via asyncio rather than the ThreadPoolExecutor.
I think the only non-blocking operations in the code being called were:
- time.sleep (has an async variant)
- asyncio.sleep (async)
- self.socket.recv_multipart (async)
If so, I don't think we need to use threading here so I've refactored the code so that the underlying async functions could be called via asyncio.
If so, this cuts out the need for the ThreadPoolExecutor removing the workflow limit.
Check List
- [x] I have read
CONTRIBUTING.mdand added my name as a Code Contributor. - [x] Contains logically grouped changes (else tidy your branch by rebase).
- [x] Does not contain off-topic changes (use other PRs for other changes).
- [x] Applied any dependency changes to both
setup.cfg(andconda-environment.ymlif present). - [x] Tests are included (or explain why tests are not needed).
- [x]
CHANGES.mdentry included if this is a change that can affect users - [x] Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
- [x] If this is a bug fix, PR should be raised against the relevant
?.?.xbranch.
Nice, hope we can get this working 🚀
Ping @dwsutherland, I may well have overlooked something, does this make sense to you?
(note, tests are unhappy for async reasons, but it should work for real usage)
Ping @dwsutherland, I may well have overlooked something, does this make sense to you?
(note, tests are unhappy for async reasons, but it should work for real usage)
Will have a look, I think I wanted to do this when I was first building, but didn't manage to figure out how to have non-blocking subscription receive loops or something.. Can't remember exactly why (wasn't as simple as the sleeps..), but well done if you've figured it out 👏
The changes look quite simple .. had a play, and I did notice something I couldn't reproduce on master.
When stopping a workflow in the WUI (latest master cylc-ui):
[I 2024-04-09 21:37:20.380 CylcUIServer] $ cylc play --color=never --mode live five/run1
[I 2024-04-09 21:37:20.901 CylcHubApp log:191] 200 POST /user/sutherlander/cylc/graphql (sutherlander@::ffff:127.0.0.1) 530.46ms
[I 2024-04-09 21:37:20.906 CylcUIServer] [data-store] connect_workflow('~sutherlander/five/run1', <dict>)
[I 2024-04-09 21:37:30.551 CylcHubApp log:191] 200 POST /user/sutherlander/cylc/graphql (sutherlander@::ffff:127.0.0.1) 99.47ms
[I 2024-04-09 21:37:30.943 CylcUIServer] [data-store] disconnect_workflow('~sutherlander/holder/run1')
21:37:31.535 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
[I 2024-04-09T21:37:31.541 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 5.35ms
21:37:31.894 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
.
.
.
[I 2024-04-09T21:37:34.955 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 1.33ms
21:37:37.405 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
[I 2024-04-09T21:37:37.407 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 1.35ms
[W 2024-04-09T21:37:39.146 JupyterHub base:1154] User sutherlander server stopped, with exit code: 0
[I 2024-04-09T21:37:39.146 JupyterHub proxy:356] Removing user sutherlander from proxy (/user/sutherlander/)
Not sure why.. could be something unrelated? maybe this branch needs rebased..
Looks like it might be trying to reconnect to the disconnected workflow?
Ah, think I might know why, maybe .. the subscriber resolver (cylc-flow) might use that w_subs variable, which you changed to subscribers but workflow_subscribers would have been better I think
Does it make sense to deal with this one: https://github.com/cylc/cylc-uiserver/issues/584 here, at the same time?
Can move away from Tornado async interfaces in due course (these are already translated into asyncio calls by the tornado library), Jupyter Server hasn't moved over yet so we may need to wait a while before changing ourselves.