
Prefix kernel ids

Open jtpio opened this issue 4 years ago • 11 comments

Hey folks,

Is it possible to use custom kernel ids in an ExtensionApp when starting a new kernel with self.kernel_manager.start_kernel()?

This seems to be supported in the base MultiKernelManager class in jupyter_client:

https://github.com/jupyter/jupyter_client/blob/15b3b0566d962bbc042d21b858caeca2e870209a/jupyter_client/multikernelmanager.py#L130

However, jupyter_server's MappingKernelManager handles start_kernel a bit differently:

https://github.com/jupyter/jupyter_server/blob/edc9dc211adbe27106a73b7f65989d2b0bdb604b/jupyter_server/services/kernels/kernelmanager.py#L168-L190

And it will raise an exception when a new kernel_id is specified here:

https://github.com/jupyter/jupyter_server/blob/edc9dc211adbe27106a73b7f65989d2b0bdb604b/jupyter_server/services/kernels/kernelmanager.py#L385-L386

I'm currently testing this with the simple_ext1 extension by adding the following code snippet, which starts a new kernel on each request:

base_kernel_id = self.kernel_manager.new_kernel_id()
kernel_id = await self.kernel_manager.start_kernel(kernel_id=f'prefix-{base_kernel_id}', kernel_name='python3', path='.')

To: https://github.com/jupyter/jupyter_server/blob/edc9dc211adbe27106a73b7f65989d2b0bdb604b/examples/simple/simple_ext1/handlers.py#L5-L11
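
For reference, a fuller version of that experiment might look like the sketch below (the handler class and route are illustrative, not part of simple_ext1):

from jupyter_server.base.handlers import JupyterHandler
from tornado import web


class StartPrefixedKernelHandler(JupyterHandler):
    """Hypothetical handler: starts a kernel with a prefixed id on each GET."""

    @web.authenticated
    async def get(self):
        # Derive a base UUID from the manager, then prepend a custom prefix.
        base_kernel_id = self.kernel_manager.new_kernel_id()
        # Note: this currently raises in jupyter_server's MappingKernelManager,
        # which treats an explicit kernel_id as "attach to an existing kernel".
        kernel_id = await self.kernel_manager.start_kernel(
            kernel_id=f"prefix-{base_kernel_id}",
            kernel_name="python3",
            path=".",
        )
        self.finish(kernel_id)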

jtpio avatar May 28 '20 13:05 jtpio

hi @jtpio. The format of a kernel_id should be opaque. Since they have always been UUIDs (at least in my understanding), it's hard to know whether consumers actually treat them as opaque, but a prefix of the sort you're describing should be okay.

The jupyter_server behavior is (currently) directly tied to the notebook server's behavior and I suspect the specification of a kernel_id in this manner is solely to support the ability for multiple notebooks to use the same kernel. Why something like "attach" or "share" wasn't implemented to accomplish this, I don't know.

Enterprise Gateway also has a bring-your-own-kernel_id requirement, but accomplishes it via the fact that its callers, in this case, are REST clients: the JSON body conveys the custom kernel_id, and EG implements new_kernel_id() on its subclasses of the kernel managers. However, even in that case, we enforce UUID compatibility (primarily for fear of the unknown relative to how other components of the framework would behave).
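
(For illustration, that kind of UUID-compatibility check is a one-liner with the standard library - a sketch, not EG's actual code - and note that a prefixed id would fail it:)

import uuid

def ensure_uuid(kernel_id: str) -> str:
    """Raise ValueError if kernel_id is not a valid UUID string."""
    uuid.UUID(kernel_id)  # raises ValueError on non-UUID input
    return kernel_id

ensure_uuid("10f619b3-5d39-49a6-bed2-5218a32ce774")          # ok
# ensure_uuid("voila-be15ce22-d8ff-4d53-ba6b-2aa77925870f")  # ValueError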

Since jupyter_server has MappingKernelManager, I wouldn't be opposed to having an argument other than kernel_id (since its semantics are "attach") - say new_kernel_id, for example - that is checked in an override of new_kernel_id(), which then uses the new argument if it exists.
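
A minimal sketch of that idea, assuming the new argument is passed through start_kernel() and named new_kernel_id (nothing here is an existing API, and the stashing is not safe under concurrent starts):

from jupyter_server.services.kernels.kernelmanager import MappingKernelManager


class CustomIdKernelManager(MappingKernelManager):
    """Sketch: honor a caller-supplied id without overloading kernel_id."""

    _pending_kernel_id = None

    async def start_kernel(self, kernel_id=None, **kwargs):
        # Pop the custom argument so it never reaches the per-kernel manager.
        self._pending_kernel_id = kwargs.pop("new_kernel_id", None)
        return await super().start_kernel(kernel_id=kernel_id, **kwargs)

    def new_kernel_id(self, **kwargs):
        # Use the stashed custom id if one was supplied, else a fresh UUID.
        if self._pending_kernel_id is not None:
            custom_id, self._pending_kernel_id = self._pending_kernel_id, None
            return custom_id
        return super().new_kernel_id(**kwargs)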

My bigger concern is the unknown relative to format, but I suspect that should be okay since UUIDs are rarely interpreted (nor should they be).

kevin-bates avatar May 28 '20 13:05 kevin-bates

Thanks @kevin-bates for the quick reply!

The main motivation for now is to differentiate between kernels started from a notebook frontend via the default handler of the notebook / jupyter server, and kernels started in a separate handler (sharing the same kernel manager) via a direct call to kernel_manager.start_kernel() (for example in the Voila handler).

Prefixing kernel ids could help, but there might be a better way to achieve this.

jtpio avatar May 28 '20 14:05 jtpio

Hmm - are you expecting things like /api/kernels (to get a list of running kernels) to return different results depending on which handler serviced that request? Or a DELETE /api/kernels/<kernel_id> issued from the notebook front-end to NOT shutdown that kernel if <kernel_id> was a Voila-started kernel?

kevin-bates avatar May 28 '20 14:05 kevin-bates

Indeed. The prefix would be reflected in the id of the response to GET /api/kernels:

[
  {
    "id": "10f619b3-5d39-49a6-bed2-5218a32ce774",
    "name": "python3",
    "last_activity": "2020-05-28T13:18:14.175879Z",
    "execution_state": "idle",
    "connections": 0
  },
  {
    "id": "8277b7b0-eb11-48c3-bb24-194b902be82e",
    "name": "python3",
    "last_activity": "2020-05-28T13:18:56.735656Z",
    "execution_state": "idle",
    "connections": 0
  },
  {
    "id": "voila-be15ce22-d8ff-4d53-ba6b-2aa77925870f",
    "name": "python3",
    "last_activity": "2020-05-28T13:18:59.548913Z",
    "execution_state": "idle",
    "connections": 0
  }
]

jtpio avatar May 28 '20 14:05 jtpio

Could you please clarify?

By "Indeed" are you suggesting that, yes, you do want to have the respective handlers filter on their respective "id types"?

Or are you realizing that the REST API would be an issue, since custom kernel ids would, by default, be exposed and "available" for "actions" (interrupt, restart, shutdown) from different clients?

Thanks.

kevin-bates avatar May 28 '20 15:05 kevin-bates

This was mostly an answer to that question:

are you expecting things like /api/kernels (to get a list of running kernels) to return different results depending on which handler serviced that request?

To say that yes, if a custom kernel_id can be passed when starting kernels, the response from a GET /api/kernels would be enough to know which handler started each kernel, because the prefix would be in the id field. So a client could retrieve the list of all kernels with a GET /api/kernels, and then filter the list based on the id.
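
For illustration, such client-side filtering could be as simple as the following (a sketch using requests; the server URL, token, and "voila-" prefix are made up):

import requests

BASE_URL = "http://localhost:8888"  # illustrative server URL
TOKEN = "..."                       # illustrative auth token

resp = requests.get(f"{BASE_URL}/api/kernels", params={"token": TOKEN})
resp.raise_for_status()

# Keep only the kernels whose id carries the extension's prefix.
voila_kernels = [k for k in resp.json() if k["id"].startswith("voila-")]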

you do want to have the respective handlers filter on their respective "id types"? Or you are realizing that the REST API would be an issue since custom kernel ids would, by default, be exposed and "available" for "actions" (interrupt, restart, shutdown) from different clients?

Thinking about it a bit more, it would probably be better to instantiate a separate kernel manager, and add extra handlers (for example /extensionapp/api/kernels/) to properly isolate the list of kernels. This would help avoid sharing the same default kernel manager.

So a regular GET /api/kernels/ would list the kernels started via the notebook frontend. And a GET /extensionapp/api/kernels/ would list the kernels started by the extension app handler (provided that the ExtensionApp instantiates its own kernel manager).
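
As a rough sketch of that isolation (the handler class, route, and the way the second manager is attached are all illustrative - the real ExtensionApp wiring may differ):

import json

from jupyter_server.base.handlers import JupyterHandler
from jupyter_server.services.kernels.kernelmanager import MappingKernelManager
from tornado import web

# A second manager instance, separate from the server-wide one.
extension_kernel_manager = MappingKernelManager()


class ExtensionKernelsHandler(JupyterHandler):
    """Lists only the kernels started through the extension's own manager."""

    @web.authenticated
    async def get(self):
        models = extension_kernel_manager.list_kernels()
        # default=str covers any datetime fields in the kernel models.
        self.finish(json.dumps(models, default=str))


# Registered under the extension's namespace, e.g.:
# handlers = [(r"/extensionapp/api/kernels/?", ExtensionKernelsHandler)]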

jtpio avatar May 28 '20 15:05 jtpio

We talked about this just now at the jupyter_server meeting. I love the idea of BYOKID (bring your own kernel id). I have read through this conversation but need to put my hands in the code to better understand the current limits. A few questions in the meantime:

  • I understand there is an issue today if someone wants to BYOKID when creating a kernel - and there is no issue today if someone wants to join an existing kernel with its running id - correct?
  • Would the concept of BYOKID be enough to cover the Voila case? I am happy to investigate this option if it covers the needs. In that case, the Voila frontend would be responsible for filtering the kernels.
  • I am not sure a separate kernel manager is possible today (the services are singletons).

echarles avatar May 28 '20 16:05 echarles

@jtpio - Thanks for the clarification - that all makes sense and seems reasonable to partition based on endpoints and instances.

@echarles

There is no issue today if someone wants to join an existing kernel with its running id - correct?

That's my understanding, yes.

I am not sure a separate kernel manager is possible today (the services are singletons).

By creating their own instance of MappingKernelManager, and assuming Voila doesn't go through (or use) sessions to manage kernels, I think they'd be okay, because only their kernel handlers - all of which would need to be implemented - would hit their instance of MKM (and vice versa for the 'traditional' applications). Of course, issues may be encountered when getting into the weeds, but I suspect it would work. MKM itself isn't a singleton; the server just creates a single instance of it currently.

That said, this all feels a bit messy and heavy-handed. I thought there was a way for an extension to say - "I'm exclusive - no other extensions are allowed". Is that approach not viable?

Also, please keep in mind that jupyter_server will be changing the implementation of MappingKernelManager to use Kernel Providers at some point and this will impact subclasses. At that time, MappingKernelManager will no longer derive from jupyter_client's MultiKernelManager and the KernelManager instances it maintains will be a product of a KernelProvider (where different KernelProviders are in play).

kevin-bates avatar May 28 '20 16:05 kevin-bates

MKM itself isn't a singleton; the server just creates a single instance of it currently.

Cool, then the separate handler can make sense.

I thought there was a way for an extension to say - "I'm exclusive - no other extensions are allowed". Is that approach not viable?

True, load_other_extensions can be set to False. We had a lot of refactoring (and it will continue...) - we need to make sure this still works as expected in the 3 current ways to launch a server with an extension: [1] jupyter server ... - [2] python -m ... - [3] jupyter my_extension

echarles avatar May 28 '20 17:05 echarles

By creating their own instance of MappingKernelManager and assuming Voila doesn't go through (or use) sessions to manage kernels

There is some exploratory work on using a separate kernel manager (edit: when Voila is used as a server extension) in https://github.com/voila-dashboards/voila/pull/41 (this branch doesn't use the ExtensionApp yet, but the idea should be similar after the switch to ExtensionApp too).

Also, please keep in mind that jupyter_server will be changing the implementation of MappingKernelManager to use Kernel Providers at some point and this will impact subclasses

Right, posting a link to the corresponding PR here for reference: https://github.com/jupyter/jupyter_server/pull/112

Would the concept of BYOKID be enough to cover the Voila case?

In the end, BYOKID might not even be necessary for Voila if Voila always creates its own kernel manager (it's not decided yet whether that will be the case, but it's still interesting to consider).

However BYOKID might still be useful for other use cases.

jtpio avatar May 29 '20 14:05 jtpio

Sounds good Jeremy. Please note that even with kernel providers, you won't have the isolation that I understand you'd like unless you split out the handlers and implement your own MKM (or figure out another way to make the separation exclusive to the respective clients).

kevin-bates avatar May 29 '20 14:05 kevin-bates