graphql-core v3 upgrade
closes https://github.com/cylc/cylc-uiserver/issues/333 closes https://github.com/cylc/cylc-uiserver/issues/388
Is a sibling to: https://github.com/cylc/cylc-flow/pull/6478 https://github.com/cylc/cylc-ui/pull/2091 (cylc-flow/cylc-uiserver PR needed to be merged at the same time)
Tests will fail until cylc-flow end is in.
Dependencies on:
graphene-tornado==2.6.*
graphql-ws==0.4.4
have been removed and/or rewritten as a Cylc port of active projects.
There are some short comings in graphene==3.4.3 with the use of middleware and our bespoke execution context (for null-stripping fields) in subscriptions. So I had to adopt and augment a graphql-core package:
https://github.com/graphql-python/graphql-core/blob/v3.2.6/src/graphql/execution/subscribe.py
to our own:
cylc/uiserver/graphql/subscribe.py
The HEAD of graphql-core has this remedied, but both that and the a corresponding graphene will need to be released.
Until then cylc-flow has graphene fixed at 3.4.3, as we'll need to update our code with any future release of graphene/graphql-core (hopefully dropping this workaround).
All aspects appear to be functional:
Check List
- [x] I have read
CONTRIBUTING.mdand added my name as a Code Contributor. - [x] Contains logically grouped changes (else tidy your branch by rebase).
- [x] Does not contain off-topic changes (use other PRs for other changes).
- [x] Applied any dependency changes to both
setup.cfg(andconda-environment.ymlif present). - [ ] Tests are included (or explain why tests are not needed).
- [x] Changelog entry included if this is a change that can affect users
@dwsutherland, at some point, please could you rebase this onto master (pick up recent GH actions changes) and edit this line:
https://github.com/cylc/cylc-uiserver/blob/4e7c9a6454a201190437953449da0f9ed72b2669/.github/workflows/test.yml#L30
To add 3 to the list to unlock testing with the latest Python versions (which these dependency changes will allow!).
partially addresses #333
I think it closes it?
partially addresses #333
I think it closes it?
Well, this in conjunction with it's cylc-flow/cylc-ui siblings
FYI: You accidentally force-pushed over some suggestions so will need to re-apply them.
I've been getting this traceback shortly after UIS statrup:
Task exception was never retrieved
future: <Task finished name='Task-36' coro=<TornadoSubscriptionServer._process_message() done, defined at ~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py:173> exception=UnboundLocalError("local variable 'iterator' referenced before assignment")>
Traceback (most recent call last):
File "~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py", line 189, in _process_message
return await self.on_start(connection_context, op_id, params)
File "~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py", line 308, in on_start
await iterator.aclose()
UnboundLocalError: local variable 'iterator' referenced before assignment
The server stays up, but some code isn't being called when it is meant to (e.g. connection_context.unsubscribe(op_id)). This might have something to do with starting a server when you already have a browser tab open?
This does seem to be a code error, I've worked around it like so:
diff --git a/cylc/uiserver/graphql/tornado_ws.py b/cylc/uiserver/graphql/tornado_ws.py
index e42052f..5916d89 100644
--- a/cylc/uiserver/graphql/tornado_ws.py
+++ b/cylc/uiserver/graphql/tornado_ws.py
@@ -286,6 +286,7 @@ class TornadoSubscriptionServer:
params['kwargs']['root_value'] = op_id
execution_result = await self.execute(params)
+ iterator = None
try:
if isawaitable(execution_result):
execution_result = await execution_result
@@ -305,7 +306,8 @@ class TornadoSubscriptionServer:
except Exception as e:
await self.send_error(connection_context, op_id, e)
finally:
- await iterator.aclose()
+ if iterator:
+ await iterator.aclose()
await self.send_message(connection_context, op_id, GQL_COMPLETE)
await connection_context.unsubscribe(op_id)
await self.on_operation_complete(connection_context, op_id)
FYI: You accidentally force-pushed over some suggestions so will need to re-apply them.
No idea how that happened, must have push from an old version? .. fixed now
This does seem to be a code error, I've worked around it like so:
Applied the same workaround.. From looking at the code, this could happen when the execution result is an error or something not async iterator.. Not sure what actions this translates to for the user, you may be correct.
@dwsutherland, raised a PR with the requested changes above: https://github.com/dwsutherland/cylc-uiserver/pull/5
I think the AuthorizationMiddleware needs to be updated to reflect changes in the info object. At present, it looks like the if len(info.path) > 1: return next_(root, info, **args) return is always activated, which has the effect of bypassing the authorisation.
I think I know the differences in info.path that caused this.. I had to deal with this in the other middleware..
Have pushed a fix:
[I 2025-05-22 21:03:22.650 CylcUIServer] other-suther: not authorized to stop
Although, I haven't been about to setup permissions properly for my other-user .. so maybe something wrong there
Loaded up the hub as root with this in the /root/.cylc/uiserver/jupyter_config.py :
c.Authenticator.allow_all = True
c.CylcUIServer.site_authorization = {
"*": { # For all ui-server owners,
"*": { # Any authenticated user
"limit": ["READ", "CONTROL"],
},
},
}
c.CylcUIServer.user_authorization = {
'other-suther': ['READ', 'Poll', 'Play', 'Pause', 'Stop'],
}
c.JupyterHub.load_roles = [
{
# allow all authenticated users to access, start and stop
# each other's servers
"name": "user",
"scopes": ["self", "access:servers", "servers"],
}
]
Then logged in as sutherlander and spun up UIS..
Then logged into a private browser as other-suther .. and went to the UIS
@oliver-sanders - Can you test both that it works and setting privileges does too?
Although, I haven't been about to setup permissions properly for my other-user .
What do you mean by "privileges" - just fine grained authorization of Cylc actions?
Although, I haven't been about to setup permissions properly for my other-user .
What do you mean by "privileges" - just fine grained authorization of Cylc actions?
Yeah, tried it this morning on my VM with.. it's picking it up, but setting to read:
[I 2025-05-23 14:16:16.406 CylcUIServer] Serving UI from: /home/sutherlander/cylc/cylc-ui/dist
{'other-suther': ['READ', 'Poll', 'Play', 'Pause', 'Stop']}
{'*': {'*': {'limit': ['READ', 'CONTROL']}}}
[I 2025-05-23 14:16:16.408 CylcUIServer] User other-suther authorized permissions: ['read']
{'read'}
just have to dig deeper as to why it's setting it to read
Is it this: https://github.com/cylc/cylc-uiserver/issues/692
Is it this: #692
Fixed:
https://github.com/cylc/cylc-uiserver/pull/693
Couple of changes accidentally reverted in latest force push (I've unresolved the conversations)
Yeah, I probably didn't fetch these locally
Kicking tests
Will fix any in the morning (if that's ok).. looks like mypy is failing at min
(although if it's a simple fix, welcome to push)
Tests are passing, however, coverage is a bit rubbish with the new code, so I've replaced the ignored cylc/uiserver/websockets/* with cylc/uiserver/graphql/*
(although, it's not all foreign (~half) so it should be tested at some point).
We need to add tests that spin up a real UIS (at the moment, they're all mock) so I can add a subscription test(s).. (but perhaps that should be a follow-on PR)
so I've replaced the ignored cylc/uiserver/websockets/* with cylc/uiserver/graphql/* (although, it's not all foreign (~half) so it should be tested at some point).
If its "our code" (i.e. we plan to maintain it into the future) it should not be excluded from coverage so we are aware of this coverage hole and fix it in the future.
We need to add tests that spin up a real UIS (at the moment, they're all mock)
This fixture provides a server capable of fulfilling requests:
https://github.com/cylc/cylc-uiserver/blob/89a62e25c2c87d3c9a0a107925cd30dcff2ddc68/cylc/uiserver/tests/conftest.py#L244-L249
Coverage of untested code reinstated.. Will raise a new PR next week for this
Tests appear to be failing because of the new change in master?
File "/home/runner/work/cylc-uiserver/cylc-uiserver/cylc/uiserver/__init__.py", line 20, in <module>
from cylc.uiserver.app import CylcUIServer
File "/home/runner/work/cylc-uiserver/cylc-uiserver/cylc/uiserver/app.py", line 90, in <module>
from cylc.flow.network.graphql import (
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/cylc/flow/network/graphql.py", line 41, in <module>
from cylc.flow.network.schema import NODE_MAP
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/cylc/flow/network/schema.py", line 636, in <module>
class Workflow(ObjectType):
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/graphene/types/objecttype.py", line 44, in __new__
dataclass = make_dataclass(name_, fields, bases=())
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1230, in make_dataclass
return dataclass(cls, init=init, repr=repr, eq=eq, order=order,
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1019, in dataclass
return wrap(cls)
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1011, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line [86](https://github.com/cylc/cylc-uiserver/actions/runs/15758596335/job/44419593075?pr=672#step:9:90)1, in _process_class
cls_fields = [_get_field(cls, name, type)
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 861, in <listcomp>
cls_fields = [_get_field(cls, name, type)
File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 745, in _get_field
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'list'> for field log_records is not allowed: use default_factory
gdammit.
Looks like this works under graphql-core v2 but not v3?!
The message sounds like it is complaining about the use of a list as a default value, however, we do the exact same thing with broadcasts just a couple of lines above?!
broadcasts = GenericScalar(
ids=graphene.List(
ID,
description=sstrip('''
Node IDs, cycle point and/or-just family/task namespace.
E.g: `["1234/foo", "1234/FAM", "*/FAM"]`
'''),
default_value=[] # OK?
),
resolver=resolve_broadcasts,
description='Any active workflow broadcasts.'
)
pruned = Boolean() # TODO: what is this? write description
n_edge_distance = Int(
description=sstrip('''
The maximum graph distance (n) from an active node
of the data-store graph window.
'''),
)
log_records = graphene.List(
LogRecord,
description=sstrip('''
Scheduler log messages of level WARNING and above.
Warning: This is an incrementally updated field. Each update
contains a list of **new** log messages, not a list of all
messages. It is down to the client to store / preserve previous
records.
'''),
default_value=[], # Not OK?!
)
This seems to do the trick: https://github.com/cylc/cylc-flow/pull/6803
Integration failure fixed, poking tests...
The message sounds like it is complaining about the use of a list as a default value, however, we do the exact same thing with broadcasts just a couple of lines above?!
I guess default values on arguments are treated different to those on the fields? Funny thing is, the error is complaining about an immutable type:
ValueError: mutable default <class 'list'> for field log_records is not allowed: use default_factory
and default_factory isn't a graphene option
../../.envs/uis-gql/lib/python3.10/site-packages/graphene/types/argument.py:112: in to_arguments
raise ValueError(f'Unknown argument "{default_name}".')
E ValueError: Unknown argument "default_factory".
but if you make it a immutable type:
default_value=(),
It doesn't complain 😆 (also default_value=None works) .. Maybe a bug? idk
(I had to remove similar default_value=[] elsewhere BTW)
However, I think the field default is implicit? (if there is nothing but you ask for the field it will return [] not null)
So removing it is probably the right thing to do..
I've been trying to get the subscription test working, so far I have got to using:
@pytest.fixture
def gql_sub(jp_ws_fetch):
"""Perform a GraphQL request.
E.G:
conn = await gql_sub(
*('cylc', 'subscriptions'),
query='''
subscription {
workflows {
id
status
}
}
''',
)
while True:
msg = await conn.read_message()
# . . .
"""
async def _fetch(*endpoint, query=None, headers=None):
headers = headers or {}
#headers = {
# **headers,
# 'Content-Type': 'application/json'
#}
return await jp_ws_fetch(
*endpoint,
headers=headers,
params={'query': query},
subprotocols=['graphql-ws'],
)
return _fetch
async def test_subscription(gql_sub, dummy_workflow):
"""Test sending the most basic GraphQL subscription."""
# configure two dummy workflows so we have something to look at
await dummy_workflow('foo')
await dummy_workflow('bar')
# perform the request
conn = await gql_sub(
*('cylc', 'subscriptions'),
query='''
subscription {
workflows {
id
status
}
}
''',
)
print(conn)
while True:
msg = await conn.read_message()
print(msg)
break
assert False
However, I must be doing something wrong.. as it hangs on the conn = await gql_sub ... Ref to:
https://www.tornadoweb.org/en/stable/_modules/tornado/websocket.html#websocket_connect
and
https://github.com/jupyter-server/pytest-jupyter/blob/main/pytest_jupyter/jupyter_server.py
maybe I have to do something like:
with await gql_sub(...) as conn:
...
Will have another crack at it next week...
If this is a new test, maybe remove it from this PR and prepare a new PR to add the tests?
If this is a new test, maybe remove it from this PR and prepare a new PR to add the tests?
Yes, will do.
Added a back compat patch for: https://github.com/cylc/cylc-uiserver/issues/698
(justification: https://github.com/cylc/cylc-uiserver/issues/698#issuecomment-3018637188)