jupyter_server Support GraphQL

See https://github.com/jupyterlab/jupyterlab/issues/11789

Problem

JupyterLab constantly polls the server to retrieve information about:

files and directories (contents API)
running terminals (terminals API)
running notebooks (sessions API)
running kernels (kernels API)

This is not optimal because polling might:

make useless requests (when there is no update)
make requests long after the update happened
get back more information than needed
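
To make the first two points concrete, here is a minimal standard-library sketch contrasting a polling client with a subscriber that is only woken up when something changes. `KernelRegistry`, `poll` and `listen` are hypothetical names invented for this illustration; they are not part of Jupyter Server, and a real implementation would push over a WebSocket or a GraphQL subscription rather than an in-process queue.

```python
import asyncio


class KernelRegistry:
    """Hypothetical stand-in for server-side state (e.g. running kernels)."""

    def __init__(self):
        self.kernels = []
        self._subscribers = []

    def subscribe(self):
        # Each subscriber gets its own queue that receives state snapshots.
        queue = asyncio.Queue()
        self._subscribers.append(queue)
        return queue

    def start_kernel(self, name):
        self.kernels.append(name)
        # Push the new state to every subscriber as soon as it changes.
        for queue in self._subscribers:
            queue.put_nowait(list(self.kernels))


async def poll(registry, times, interval):
    """Polling client: issues a request every interval, changed or not."""
    requests = 0
    for _ in range(times):
        _snapshot = list(registry.kernels)  # stands in for an HTTP GET
        requests += 1
        await asyncio.sleep(interval)
    return requests


async def listen(queue, expected):
    """Push client: does nothing until the server sends an update."""
    updates = []
    while len(updates) < expected:
        updates.append(await queue.get())
    return updates


async def main():
    registry = KernelRegistry()
    queue = registry.subscribe()
    poller = asyncio.create_task(poll(registry, times=5, interval=0.01))
    listener = asyncio.create_task(listen(queue, expected=1))
    await asyncio.sleep(0.02)
    registry.start_kernel("python3")  # the only actual change
    return await poller, await listener


poll_requests, pushed_updates = asyncio.run(main())
```

In this sketch the poller makes five requests for a single state change, while the subscriber receives exactly one message.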

Proposed Solution

Would it make sense to support a GraphQL API? I remember there has been some work on this already, but I can't find it.

Jan 05 '22 17:01 davidbrochart

AFAIK @saulshanabrook did all of that work on his rtc branch

Jan 05 '22 17:01 blink1073

@davidbrochart The demo code I was working on is in https://github.com/saulshanabrook/rtc/tree/graphql/packages/jupyter-graphql. It was working for a subset of the Jupyter server and used subscriptions to push the analysis of kernel messages to the server, keeping the state there.

I stopped working on it due to pressure to prioritize a working RTC implementation.

Jan 05 '22 21:01 saulshanabrook

The client having to pull that information is really something we should move away from, in favor of something more event-based and pushed from the server.

If GraphQL can bring that, it would be wonderful. This should be an addition to all the existing APIs, not a replacement, to ensure backwards compatibility.

I stopped working on it due to pressure to prioritize a working RTC implementation.

@saulshanabrook Yeah, we had discussed that. My understanding is that GraphQL still makes sense even without the RTC aspects, which are now implemented via CRDT. But from what I see, not all RTC needs should or could be covered by CRDT, which is focused purely on documents. For example, the RTC event "Open a notebook" could be fulfilled by GraphQL... ? (just thinking out loud)

Jan 06 '22 06:01 echarles

Thanks @blink1073, @saulshanabrook and @echarles for the feedback. I think GraphQL is not only helpful for handling notifications pushed from the server (and removing the polling from the client), but also to give clients more flexibility as to which information they request. If it can be useful for RTC, that would be another reason to support it, but I'm not sure how. I will look at Saul's work to try and have a better understanding. Maybe Jupyverse would be a good place to start experimenting with a GraphQL API, because FastAPI makes it easy to use any ASGI-compatible GraphQL library. I don't know if it's as easy with Tornado.
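
As a rough illustration of the "which information they request" point, the sketch below projects a record down to the fields a client asks for. This is not GraphQL itself, only the field-selection idea expressed in plain Python; `select`, the sample `session` record, and its field names are assumptions made for the example. With a hypothetical schema, the corresponding GraphQL query might look something like `{ session { path kernel { execution_state } } }`.

```python
def select(record, selection):
    """Return only the requested fields of a nested dict.

    `selection` maps field names to True (take the value as-is) or to a
    nested selection dict (recurse into a sub-record).
    """
    result = {}
    for field, sub in selection.items():
        value = record[field]
        result[field] = select(value, sub) if isinstance(sub, dict) else value
    return result


# Illustrative record shaped loosely like a session entry (assumed fields).
session = {
    "id": "abc",
    "path": "nb.ipynb",
    "type": "notebook",
    "kernel": {
        "id": "k1",
        "name": "python3",
        "execution_state": "idle",
        "connections": 1,
    },
}

# The client only cares about the path and the kernel's state.
selection = {"path": True, "kernel": {"execution_state": True}}
trimmed = select(session, selection)
```

A REST endpoint would typically return the whole `session` record; with field selection the client receives only `trimmed`.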

Jan 06 '22 08:01 davidbrochart

Maybe Jupyverse would be a good place to start experimenting with a GraphQL API, because FastAPI makes it easy to use any ASGI-compatible GraphQL library. I don't know if it's as easy with Tornado.

As @saulshanabrook pointed out, GraphQL on Tornado is already implemented in https://github.com/saulshanabrook/rtc/tree/graphql/packages/jupyter-graphql

I would favor experimenting on top of the existing Jupyter Server instead of Jupyverse to deliver value as soon as possible to existing Jupyter Server frontends.

Jan 06 '22 08:01 echarles

Yeah, the implementation I was working on works as an extension on top of Jupyter Server, which allows clients to connect either with the existing endpoints or by using GraphQL. It uses the same in-memory data structures as the server, so both can be used simultaneously.

See for example https://github.com/saulshanabrook/rtc/blob/graphql/packages/jupyter-graphql/jupyter_graphql/jupyter_server_extension.py which adds a Jupyter Server extension for GraphQL as well as for the GraphQL playground.

The Services class is what takes the Jupyter Server services and adds listeners to maintain its own structures.
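
For readers who have not opened the repository, the listener idea can be sketched as follows. This is not taken from that code: `KernelManager` here is a hypothetical stand-in rather than the real Jupyter Server class, and this `Services` only borrows the name, so the actual implementation may look quite different.

```python
class KernelManager:
    """Hypothetical stand-in for a server service that owns kernel state."""

    def __init__(self):
        self._kernels = {}
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def _notify(self, event, kernel_id):
        for callback in self._listeners:
            callback(event, kernel_id, self._kernels.get(kernel_id))

    def start(self, kernel_id, name):
        self._kernels[kernel_id] = {"id": kernel_id, "name": name}
        self._notify("started", kernel_id)

    def shutdown(self, kernel_id):
        del self._kernels[kernel_id]
        self._notify("stopped", kernel_id)

    def list_kernels(self):
        # What an existing REST-style handler would return.
        return list(self._kernels.values())


class Services:
    """Keeps its own view of the state by listening to the service."""

    def __init__(self, kernel_manager):
        self.kernels = {}
        kernel_manager.add_listener(self._on_kernel_event)

    def _on_kernel_event(self, event, kernel_id, info):
        if event == "started":
            self.kernels[kernel_id] = info
        else:
            self.kernels.pop(kernel_id, None)


manager = KernelManager()
services = Services(manager)
manager.start("k1", "python3")
manager.start("k2", "ir")
manager.shutdown("k1")
```

Both `manager.list_kernels()` (the existing endpoint path) and `services.kernels` (what a GraphQL resolver or subscription could read) stay consistent because they are driven by the same events.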

I gave a demo of the working code in an RTC meeting a while ago: https://youtu.be/fRlVawMDVMk?t=608

Jan 06 '22 17:01 saulshanabrook

Hey, y'all! Hooray GraphQL!

With another jupyter-graphql, we got up to some fairly interesting demos. I particularly liked:

integration with graphql-voyager: having an accurate, well-typed schema that happens to generate interactive documentation is :heart_on_fire:.
wrapping nbconvert in a subscription so it could emit a live-updating view of a rendered notebook.

If I were doing it again, I would not use the graphene ORM magic, but instead ariadne, as @saulshanabrook did, or tartiflette... whichever seemed more robust/maintained/extensible. As they are both schema-driven, it would be relatively straightforward to do a bakeoff. And the schema part is the big win, as it mostly avoids things like #518. Indeed, the types that come out of GraphQL are about as expressive as TypeScript, and go beyond JSON schema... certainly robust enough to generate either... or a bunch of other things.

At the time, extensible GraphQL schema wasn't really A Thing, but now that schema federation is more well-defined, I'd probably lean towards that. The magic here would be the ability to reuse core Jupyter types on top of other GraphQL-enabled apps such as GitLab or Dagster.

In addition, there's also some of @rgbkrk's work on some node-based stuff.

ASGI-compatible

It's great for Python to define a semi-formalized thing, and indeed, I feel like adopting the ASGI model would be a step forward rather than requiring Tornado or FastAPI... but the long con of Jupyter infrastructure can't be Python-only. Getting things like #518 under control so folk could really explore alternate high-performance (or lower-resource) implementations would pay off handsomely.

easy with Tornado.

It's entirely possible to shoehorn an ASGI app in-loop with Tornado. I think this is critical for an extensible system that can also take advantage of all of the existing (and future) services a Jupyter server + extensions might provide.
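
A rough sketch of why this is feasible: an ASGI app is just an async callable taking `scope`, `receive` and `send`, and Tornado (5.0 and later) runs on the asyncio event loop, so a handler could in principle await such a callable directly. The example below uses only the standard library; `graphql_app` and `call_asgi` are hypothetical names, the app is a placeholder rather than a real GraphQL server, and the wiring into an actual Tornado `RequestHandler` is left out.

```python
import asyncio
import json


async def graphql_app(scope, receive, send):
    """Placeholder ASGI HTTP app; a real one would come from a GraphQL library."""
    assert scope["type"] == "http"
    event = await receive()
    body = json.dumps({"received_bytes": len(event.get("body", b""))}).encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})


async def call_asgi(app, method, path, body=b""):
    """Drive an ASGI app from the running loop, as a host handler might."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": method,
        "path": path,
        "query_string": b"",
        "headers": [],
    }
    sent = []

    async def receive():
        # Hand the whole request body to the app in a single message.
        return {"type": "http.request", "body": body, "more_body": False}

    async def send(message):
        # Collect response messages instead of writing to a socket.
        sent.append(message)

    await app(scope, receive, send)
    status = sent[0]["status"]
    payload = b"".join(
        m.get("body", b"") for m in sent if m["type"] == "http.response.body"
    )
    return status, payload


status, payload = asyncio.run(
    call_asgi(graphql_app, "POST", "/graphql", b'{"query": "{ kernels }"}')
)
```

In a Tornado-hosted setup, the host handler would build `scope` from the incoming request and forward the collected status, headers and body to the client, sharing the loop with every other server service.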

Jan 15 '22 18:01 bollwyvl
