
Add support for Tenant ID

Open kevin-bates opened this issue 2 years ago • 6 comments

Although Enterprise Gateway states that it supports multi-tenancy, prior to this change any "tenant" listing active kernels via /api/kernels will see the kernels of every tenant. As a result, the shutdown of one tenant would terminate the kernels of all tenants.

This pull request introduces tenant_id as a means of associating kernels with a given tenant, and thereby adds support for multi-tenancy in a minimally viable way.

The GatewayClient object that resides within Jupyter Server will add support for configuring a tenant_id. When configured, the env stanza of a kernel's start request will include the tenant_id as the value of the environment variable JUPYTER_GATEWAY_TENANT_ID. In addition, the list kernels call will include a query parameter specifying the same tenant_id. Older client applications or admin applications that do not specify the query parameter will see all kernels, just like today.
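As a rough illustration of the client-side behavior described above, here is a minimal sketch in plain Python. It is not the GatewayClient code: the helper names `build_start_body` and `build_list_url` are hypothetical, and the query parameter name `tenant_id` is an assumption, since the description only says that a query parameter carries the tenant_id.

```python
from typing import Optional
from urllib.parse import urlencode

# Environment variable name taken from the description above.
TENANT_ENV = "JUPYTER_GATEWAY_TENANT_ID"


def build_start_body(kernel_name: str, env: dict, tenant_id: Optional[str] = None) -> dict:
    """Return a kernel start request body, tagged with the tenant when one is configured."""
    body_env = dict(env)  # copy so the caller's dict is not mutated
    if tenant_id:
        body_env[TENANT_ENV] = tenant_id
    return {"name": kernel_name, "env": body_env}


def build_list_url(base_url: str, tenant_id: Optional[str] = None) -> str:
    """Return the list-kernels URL. The `tenant_id` parameter name is an assumption."""
    url = base_url.rstrip("/") + "/api/kernels"
    if tenant_id:
        url += "?" + urlencode({"tenant_id": tenant_id})
    return url
```

With no tenant_id configured, both helpers produce the same request a legacy client would send, which matches the "see all kernels, just like today" behavior.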

EG will use the tenant_id in the start request to maintain a list of that tenant's kernels. When the client requests the list of active kernels, EG will use the kernel ids associated with the given tenant_id to filter the results.

If no tenant_id is specified in the start request, the UNIVERSAL_TENANT_ID will be used. This results in common functionality for both new and legacy applications. Applications that do not configure a tenant_id will see the same behavior as today (which is really only viable for "single-tenant" installations).
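The server-side bookkeeping described above could be sketched as follows. This is an illustrative assumption, not the code in this PR: the class `TenantKernelIndex` and its methods are hypothetical, and the value given to `UNIVERSAL_TENANT_ID` is a placeholder rather than the real constant.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set

# Placeholder value for illustration only; not the constant used by EG.
UNIVERSAL_TENANT_ID = "universal-tenant-placeholder"


class TenantKernelIndex:
    """Tracks which kernel ids belong to which tenant (hypothetical helper)."""

    def __init__(self) -> None:
        self._kernels_by_tenant: Dict[str, Set[str]] = defaultdict(set)

    def register(self, kernel_id: str, env: dict) -> str:
        """Record a started kernel under its tenant, falling back to the universal tenant."""
        tenant_id = env.get("JUPYTER_GATEWAY_TENANT_ID") or UNIVERSAL_TENANT_ID
        self._kernels_by_tenant[tenant_id].add(kernel_id)
        return tenant_id

    def unregister(self, kernel_id: str) -> None:
        """Forget a kernel once it has been shut down."""
        for kernel_ids in self._kernels_by_tenant.values():
            kernel_ids.discard(kernel_id)

    def filter(self, kernels: List[dict], tenant_id: Optional[str]) -> List[dict]:
        """Return only the tenant's kernels; with no tenant_id, return everything."""
        if tenant_id is None:
            return list(kernels)
        owned = self._kernels_by_tenant.get(tenant_id, set())
        return [k for k in kernels if k["id"] in owned]
```

A list request without a tenant_id returns every kernel, which preserves the legacy and admin behavior, while a request with an unknown tenant_id returns an empty list.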

Although I haven't created a PR for the GatewayClient work (waiting for this merge), its changes can be found here.

kevin-bates avatar May 26 '22 01:05 kevin-bates

Hi @kevin-bates, can multiple users share a single notebook and work on it simultaneously if they are connected to the same JupyterLab? If so, would they also share the same kernel?

rahul26goyal avatar May 29 '22 16:05 rahul26goyal

I'm not sure how real-time collaboration (RTC) is implemented at that level, but I suspect the "sharing" is at the notebook file level and each user has their own kernel. If they shared kernels, then they'd need to share an EG and a tenant ID as well. Is that where your question is headed?

kevin-bates avatar May 29 '22 20:05 kevin-bates

We already have things like KERNEL_USERNAME, which provides a sort of human-readable tenant_id. What are the drawbacks of using something like that instead of a UUID tenant_id?

lresende avatar Jun 08 '22 16:06 lresende

> We already have things like KERNEL_USERNAME, which provides a sort of human-readable tenant_id. What are the drawbacks of using something like that instead of a UUID tenant_id?

KERNEL_USERNAME is not unique enough, both when the Jupyter server is multi-user and when we need to distinguish between an "alice" in each of two different tenants.

The UUID is not displayed anywhere; it is only (optionally) configured at the client (jupyter_server). If we went with anything like a "name", that would open the door to requiring a registry, etc., so that uniqueness can be detected.

kevin-bates avatar Jun 08 '22 16:06 kevin-bates

@lresende, @rahul26goyal - are you available to review this PR?

kevin-bates avatar Jun 28 '22 14:06 kevin-bates

I've moved this to draft. After talking with @lresende, we felt it might be best to hold off on this until we have firmer requirements, particularly around how tenant identities are configured and managed.

We will revisit down the road as necessary.

kevin-bates avatar Jul 25 '22 21:07 kevin-bates