cylc-uiserver icon indicating copy to clipboard operation
cylc-uiserver copied to clipboard

graphql-core v3 upgrade

Open dwsutherland opened this issue 9 months ago • 7 comments

closes https://github.com/cylc/cylc-uiserver/issues/333 closes https://github.com/cylc/cylc-uiserver/issues/388

Is a sibling to: https://github.com/cylc/cylc-flow/pull/6478 https://github.com/cylc/cylc-ui/pull/2091 (cylc-flow/cylc-uiserver PR needed to be merged at the same time)

Tests will fail until cylc-flow end is in.

Dependencies on:

graphene-tornado==2.6.* graphql-ws==0.4.4

have been removed and/or rewritten as a Cylc port of active projects.

There are some short comings in graphene==3.4.3 with the use of middleware and our bespoke execution context (for null-stripping fields) in subscriptions. So I had to adopt and augment a graphql-core package: https://github.com/graphql-python/graphql-core/blob/v3.2.6/src/graphql/execution/subscribe.py to our own: cylc/uiserver/graphql/subscribe.py

The HEAD of graphql-core has this remedied, but both that and the a corresponding graphene will need to be released. Until then cylc-flow has graphene fixed at 3.4.3, as we'll need to update our code with any future release of graphene/graphql-core (hopefully dropping this workaround).

All aspects appear to be functional:

graphql-core-v3

Check List

  • [x] I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • [x] Contains logically grouped changes (else tidy your branch by rebase).
  • [x] Does not contain off-topic changes (use other PRs for other changes).
  • [x] Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • [ ] Tests are included (or explain why tests are not needed).
  • [x] Changelog entry included if this is a change that can affect users

dwsutherland avatar Mar 10 '25 04:03 dwsutherland

@dwsutherland, at some point, please could you rebase this onto master (pick up recent GH actions changes) and edit this line:

https://github.com/cylc/cylc-uiserver/blob/4e7c9a6454a201190437953449da0f9ed72b2669/.github/workflows/test.yml#L30

To add 3 to the list to unlock testing with the latest Python versions (which these dependency changes will allow!).

oliver-sanders avatar Mar 13 '25 11:03 oliver-sanders

partially addresses #333

I think it closes it?

MetRonnie avatar Mar 18 '25 17:03 MetRonnie

partially addresses #333

I think it closes it?

Well, this in conjunction with it's cylc-flow/cylc-ui siblings

dwsutherland avatar Mar 20 '25 00:03 dwsutherland

FYI: You accidentally force-pushed over some suggestions so will need to re-apply them.

oliver-sanders avatar May 15 '25 12:05 oliver-sanders

I've been getting this traceback shortly after UIS statrup:

Task exception was never retrieved
future: <Task finished name='Task-36' coro=<TornadoSubscriptionServer._process_message() done, defined at ~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py:173> exception=UnboundLocalError("local variable 'iterator' referenced before assignment")>
Traceback (most recent call last):
  File "~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py", line 189, in _process_message
    return await self.on_start(connection_context, op_id, params)
  File "~/cylc-uiserver/cylc/uiserver/graphql/tornado_ws.py", line 308, in on_start
    await iterator.aclose()
UnboundLocalError: local variable 'iterator' referenced before assignment

The server stays up, but some code isn't being called when it is meant to (e.g. connection_context.unsubscribe(op_id)). This might have something to do with starting a server when you already have a browser tab open?

This does seem to be a code error, I've worked around it like so:

diff --git a/cylc/uiserver/graphql/tornado_ws.py b/cylc/uiserver/graphql/tornado_ws.py
index e42052f..5916d89 100644
--- a/cylc/uiserver/graphql/tornado_ws.py
+++ b/cylc/uiserver/graphql/tornado_ws.py
@@ -286,6 +286,7 @@ class TornadoSubscriptionServer:
 
         params['kwargs']['root_value'] = op_id
         execution_result = await self.execute(params)
+        iterator = None
         try:
             if isawaitable(execution_result):
                 execution_result = await execution_result
@@ -305,7 +306,8 @@ class TornadoSubscriptionServer:
         except Exception as e:
             await self.send_error(connection_context, op_id, e)
         finally:
-            await iterator.aclose()
+            if iterator:
+                await iterator.aclose()
         await self.send_message(connection_context, op_id, GQL_COMPLETE)
         await connection_context.unsubscribe(op_id)
         await self.on_operation_complete(connection_context, op_id)

oliver-sanders avatar May 15 '25 12:05 oliver-sanders

FYI: You accidentally force-pushed over some suggestions so will need to re-apply them.

No idea how that happened, must have push from an old version? .. fixed now

dwsutherland avatar May 16 '25 01:05 dwsutherland

This does seem to be a code error, I've worked around it like so:

Applied the same workaround.. From looking at the code, this could happen when the execution result is an error or something not async iterator.. Not sure what actions this translates to for the user, you may be correct.

dwsutherland avatar May 16 '25 01:05 dwsutherland

@dwsutherland, raised a PR with the requested changes above: https://github.com/dwsutherland/cylc-uiserver/pull/5

oliver-sanders avatar May 20 '25 14:05 oliver-sanders

@dwsutherland, raised a PR with the requested changes above: dwsutherland#5

Thanks, merged

dwsutherland avatar May 21 '25 03:05 dwsutherland

I think the AuthorizationMiddleware needs to be updated to reflect changes in the info object. At present, it looks like the if len(info.path) > 1: return next_(root, info, **args) return is always activated, which has the effect of bypassing the authorisation.

I think I know the differences in info.path that caused this.. I had to deal with this in the other middleware..

dwsutherland avatar May 22 '25 03:05 dwsutherland

Have pushed a fix: image

[I 2025-05-22 21:03:22.650 CylcUIServer] other-suther: not authorized to stop

Although, I haven't been about to setup permissions properly for my other-user .. so maybe something wrong there

Loaded up the hub as root with this in the /root/.cylc/uiserver/jupyter_config.py :

c.Authenticator.allow_all = True

c.CylcUIServer.site_authorization = {
    "*": {  # For all ui-server owners,
        "*": {  # Any authenticated user
            "limit": ["READ", "CONTROL"],
        },
    },
}

c.CylcUIServer.user_authorization = {
    'other-suther': ['READ', 'Poll', 'Play', 'Pause', 'Stop'],
}

c.JupyterHub.load_roles = [
    {
        # allow all authenticated users to access, start and stop
        # each other's servers
        "name": "user",
        "scopes": ["self", "access:servers", "servers"],
    }
]

Then logged in as sutherlander and spun up UIS.. Then logged into a private browser as other-suther .. and went to the UIS

@oliver-sanders - Can you test both that it works and setting privileges does too?

dwsutherland avatar May 22 '25 09:05 dwsutherland

Although, I haven't been about to setup permissions properly for my other-user .

What do you mean by "privileges" - just fine grained authorization of Cylc actions?

hjoliver avatar May 22 '25 23:05 hjoliver

Although, I haven't been about to setup permissions properly for my other-user .

What do you mean by "privileges" - just fine grained authorization of Cylc actions?

Yeah, tried it this morning on my VM with.. it's picking it up, but setting to read:

[I 2025-05-23 14:16:16.406 CylcUIServer] Serving UI from: /home/sutherlander/cylc/cylc-ui/dist
{'other-suther': ['READ', 'Poll', 'Play', 'Pause', 'Stop']}
{'*': {'*': {'limit': ['READ', 'CONTROL']}}}
[I 2025-05-23 14:16:16.408 CylcUIServer] User other-suther authorized permissions: ['read']
{'read'}

just have to dig deeper as to why it's setting it to read

dwsutherland avatar May 23 '25 02:05 dwsutherland

Is it this: https://github.com/cylc/cylc-uiserver/issues/692

hjoliver avatar May 23 '25 02:05 hjoliver

Is it this: #692

Fixed:

https://github.com/cylc/cylc-uiserver/pull/693

dwsutherland avatar May 23 '25 04:05 dwsutherland

Couple of changes accidentally reverted in latest force push (I've unresolved the conversations)

Yeah, I probably didn't fetch these locally

dwsutherland avatar Jun 10 '25 23:06 dwsutherland

Kicking tests

MetRonnie avatar Jun 18 '25 12:06 MetRonnie

Will fix any in the morning (if that's ok).. looks like mypy is failing at min

(although if it's a simple fix, welcome to push)

dwsutherland avatar Jun 18 '25 12:06 dwsutherland

Tests are passing, however, coverage is a bit rubbish with the new code, so I've replaced the ignored cylc/uiserver/websockets/* with cylc/uiserver/graphql/* (although, it's not all foreign (~half) so it should be tested at some point).

We need to add tests that spin up a real UIS (at the moment, they're all mock) so I can add a subscription test(s).. (but perhaps that should be a follow-on PR)

dwsutherland avatar Jun 19 '25 03:06 dwsutherland

so I've replaced the ignored cylc/uiserver/websockets/* with cylc/uiserver/graphql/* (although, it's not all foreign (~half) so it should be tested at some point).

If its "our code" (i.e. we plan to maintain it into the future) it should not be excluded from coverage so we are aware of this coverage hole and fix it in the future.


We need to add tests that spin up a real UIS (at the moment, they're all mock)

This fixture provides a server capable of fulfilling requests:

https://github.com/cylc/cylc-uiserver/blob/89a62e25c2c87d3c9a0a107925cd30dcff2ddc68/cylc/uiserver/tests/conftest.py#L244-L249

oliver-sanders avatar Jun 19 '25 11:06 oliver-sanders

Coverage of untested code reinstated.. Will raise a new PR next week for this

dwsutherland avatar Jun 19 '25 13:06 dwsutherland

Tests appear to be failing because of the new change in master?

File "/home/runner/work/cylc-uiserver/cylc-uiserver/cylc/uiserver/__init__.py", line 20, in <module>
    from cylc.uiserver.app import CylcUIServer
  File "/home/runner/work/cylc-uiserver/cylc-uiserver/cylc/uiserver/app.py", line 90, in <module>
    from cylc.flow.network.graphql import (
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/cylc/flow/network/graphql.py", line 41, in <module>
    from cylc.flow.network.schema import NODE_MAP
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/cylc/flow/network/schema.py", line 636, in <module>
    class Workflow(ObjectType):
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/site-packages/graphene/types/objecttype.py", line 44, in __new__
    dataclass = make_dataclass(name_, fields, bases=())
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1230, in make_dataclass
    return dataclass(cls, init=init, repr=repr, eq=eq, order=order,
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1019, in dataclass
    return wrap(cls)
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 1011, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line [86](https://github.com/cylc/cylc-uiserver/actions/runs/15758596335/job/44419593075?pr=672#step:9:90)1, in _process_class
    cls_fields = [_get_field(cls, name, type)
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 861, in <listcomp>
    cls_fields = [_get_field(cls, name, type)
  File "/home/runner/micromamba/envs/cylc-uiserver/lib/python3.8/dataclasses.py", line 745, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'list'> for field log_records is not allowed: use default_factory

dwsutherland avatar Jun 19 '25 13:06 dwsutherland

gdammit.

Looks like this works under graphql-core v2 but not v3?!

The message sounds like it is complaining about the use of a list as a default value, however, we do the exact same thing with broadcasts just a couple of lines above?!

    broadcasts = GenericScalar(
        ids=graphene.List(
            ID,
            description=sstrip('''
                Node IDs, cycle point and/or-just family/task namespace.

                E.g: `["1234/foo", "1234/FAM", "*/FAM"]`
            '''),
            default_value=[]  # OK?
        ),
        resolver=resolve_broadcasts,
        description='Any active workflow broadcasts.'
    )
    pruned = Boolean()  # TODO: what is this? write description
    n_edge_distance = Int(
        description=sstrip('''
            The maximum graph distance (n) from an active node
            of the data-store graph window.
        '''),
    )
    log_records = graphene.List(
        LogRecord,
        description=sstrip('''
            Scheduler log messages of level WARNING and above.

            Warning: This is an incrementally updated field. Each update
            contains a list of **new** log messages, not a list of all
            messages. It is down to the client to store / preserve previous
            records.
        '''),
        default_value=[],  # Not OK?!
    )

oliver-sanders avatar Jun 19 '25 13:06 oliver-sanders

This seems to do the trick: https://github.com/cylc/cylc-flow/pull/6803

oliver-sanders avatar Jun 19 '25 13:06 oliver-sanders

Integration failure fixed, poking tests...

oliver-sanders avatar Jun 19 '25 15:06 oliver-sanders

The message sounds like it is complaining about the use of a list as a default value, however, we do the exact same thing with broadcasts just a couple of lines above?!

I guess default values on arguments are treated different to those on the fields? Funny thing is, the error is complaining about an immutable type:

ValueError: mutable default <class 'list'> for field log_records is not allowed: use default_factory

and default_factory isn't a graphene option

../../.envs/uis-gql/lib/python3.10/site-packages/graphene/types/argument.py:112: in to_arguments
    raise ValueError(f'Unknown argument "{default_name}".')
E   ValueError: Unknown argument "default_factory".

but if you make it a immutable type:

        default_value=(),

It doesn't complain 😆 (also default_value=None works) .. Maybe a bug? idk (I had to remove similar default_value=[] elsewhere BTW)

However, I think the field default is implicit? (if there is nothing but you ask for the field it will return [] not null) So removing it is probably the right thing to do..

dwsutherland avatar Jun 23 '25 04:06 dwsutherland

I've been trying to get the subscription test working, so far I have got to using:

@pytest.fixture
def gql_sub(jp_ws_fetch):
    """Perform a GraphQL request.
    E.G:
    conn = await gql_sub(
        *('cylc', 'subscriptions'),
        query='''
            subscription {
                workflows {
                    id
                    status
                }
            }
        ''',
    )
    while True:
        msg = await conn.read_message()
        # . . .
    """
    async def _fetch(*endpoint, query=None, headers=None):
        headers = headers or {}
        #headers = {
        #    **headers,
        #    'Content-Type': 'application/json'
        #}
        return await jp_ws_fetch(
            *endpoint,
            headers=headers,
            params={'query': query},
            subprotocols=['graphql-ws'],
        )
    return _fetch
async def test_subscription(gql_sub, dummy_workflow):
    """Test sending the most basic GraphQL subscription."""
    # configure two dummy workflows so we have something to look at
    await dummy_workflow('foo')
    await dummy_workflow('bar')
    # perform the request
    conn = await gql_sub(
        *('cylc', 'subscriptions'),
        query='''
            subscription {
                workflows {
                    id
                    status
                }
            }
        ''',
    )
    print(conn)
    while True:
        msg = await conn.read_message()
        print(msg)
        break
    assert False

However, I must be doing something wrong.. as it hangs on the conn = await gql_sub ... Ref to: https://www.tornadoweb.org/en/stable/_modules/tornado/websocket.html#websocket_connect and https://github.com/jupyter-server/pytest-jupyter/blob/main/pytest_jupyter/jupyter_server.py

maybe I have to do something like:

with await gql_sub(...) as conn:
    ...

Will have another crack at it next week...

dwsutherland avatar Jun 27 '25 08:06 dwsutherland

If this is a new test, maybe remove it from this PR and prepare a new PR to add the tests?

oliver-sanders avatar Jun 27 '25 09:06 oliver-sanders

If this is a new test, maybe remove it from this PR and prepare a new PR to add the tests?

Yes, will do.

dwsutherland avatar Jun 30 '25 09:06 dwsutherland

Added a back compat patch for: https://github.com/cylc/cylc-uiserver/issues/698

image

(justification: https://github.com/cylc/cylc-uiserver/issues/698#issuecomment-3018637188)

dwsutherland avatar Jun 30 '25 10:06 dwsutherland