[ENH] - Make JupyterHub use groups and roles from Keycloak
Feature description
Until now we haven't been using JupyterHub groups and roles much. We use Keycloak as the identity provider, and we plan to rely more on Keycloak groups and roles as part of the permissions overhaul; see the following issues:
- #2304
- https://github.com/nebari-dev/jhub-apps/issues/11
The main motivation is to let jhub-apps fetch groups and roles from the JupyterHub API and use them to decide permissions. Since jhub-apps is not supposed to be tied to Nebari, the JupyterHub API is the right place for it to get them.
Relevant links
- https://discourse.jupyter.org/t/is-jupyterhub-rbac-groups-the-same-as-oauth-groups/22412/3
- https://discourse.jupyter.org/t/jupyterhub-keycloak-auth-and-ldap-user-groups/22512
- Working configuration for generic authenticator with Keycloak jupyterhub/oauthenticator#107
- https://discourse.jupyter.org/t/oidc-rbac-possible-to-map-users-to-groups-and-groups-to-roles-where-users-and-groups-are-defined-by-oidc/22426
I reckon, we might have to make changes to our Authenticator to make this happen. https://github.com/nebari-dev/nebari/blob/53194474dfbc8ac1ded81737edc777c17c6bbe97/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/files/gateway_config.py#L63
Definition of done:
- Keycloak roles and groups are accessible from JupyterHub API
https://jupyterhub.readthedocs.io/en/stable/reference/rest-api.html#/default/get_groups
Currently, this is what I get when fetching groups from JupyterHub:
{
"last_activity": "2024-03-11T15:59:01.194646Z",
"server": null,
"groups": [],
"created": "2022-12-07T16:43:29.002132Z",
"auth_state": null,
"name": "[email protected]",
"kind": "user",
"pending": null,
"admin": true,
"roles": [
"user",
"admin"
],
"servers": {},
"session_id": null,
"scopes": ["truncated"]
}
You can see that the groups are empty and that the roles are not the ones from Keycloak.
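As a rough illustration of what a client such as jhub-apps would do with this response, here is a minimal sketch that fetches a user model and pulls out its groups and roles. The helper names, the `hub_api_url` value, and the token handling are assumptions made for illustration; only the `groups` and `roles` fields come from the response above.

```python
import json
import urllib.request
from urllib.parse import quote


def fetch_user_model(hub_api_url: str, username: str, token: str) -> dict:
    """GET <hub_api_url>/users/<username> with a JupyterHub API token."""
    request = urllib.request.Request(
        f"{hub_api_url.rstrip('/')}/users/{quote(username, safe='')}",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def groups_and_roles(user_model: dict) -> tuple[list[str], list[str]]:
    """Return (groups, roles) from a user model, defaulting to empty lists."""
    return list(user_model.get("groups", [])), list(user_model.get("roles", []))


# Trimmed copy of the response above; the name is a placeholder.
sample = {"kind": "user", "name": "someone", "groups": [], "roles": ["user", "admin"]}
groups, roles = groups_and_roles(sample)
print("groups:", groups, "roles:", roles)
```

With the current setup this prints an empty group list and only the built-in JupyterHub roles, which is the gap this issue is about.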
Value and/or benefit
This will help us implement app sharing and permissioning seamlessly with keycloak.
Anything else?
No response
Thanks for the extra details!
I reckon, we might have to make changes to our Authenticator to make this happen.
From a quick glance, it looks like that is only set on c.DaskGateway.authenticator_class. I guess we should rename it to NebariDaskAuthenticator and add another one to set on c.JupyterHub.authenticator_class.
Sounds reasonable to me.
Looking at the codebase, I see that the authenticator class is set in nebari here:
https://github.com/nebari-dev/nebari/blob/ff38679218ebea0090eef045b3922b10ef22401d/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/main.tf#L143-L165
I also see that there is a JupyterHub.authenticator_class set in jhub-apps to use the NativeAuthenticator:
# Authenticate users with Native Authenticator
c.JupyterHub.authenticator_class = "nativeauthenticator.NativeAuthenticator"
# Allow anyone to sign-up without approval
c.NativeAuthenticator.open_signup = True
Is this for testing only, or does it take precedence over the one from nebari?
Yes, only for testing. That's an example jupyterhub_config.py for docker spawner.
Ok, populating groups is rather easy with the latest (not yet released) oauthenticator version. I tested it with https://github.com/nebari-dev/nebari-docker-images/pull/127 and all that is needed is adding one line in config (and updating some deprecated keys as described in https://github.com/nebari-dev/nebari-docker-images/pull/127#issuecomment-1999698300):
GenericOAuthenticator = {
+ manage_groups = true
client_id = module.jupyterhub-openid-client.config.client_id
client_secret = module.jupyterhub-openid-client.config.client_secret
oauth_callback_url = "https://${var.external-url}/hub/oauth_callback"
authorize_url = module.jupyterhub-openid-client.config.authentication_url
token_url = module.jupyterhub-openid-client.config.token_url
userdata_url = module.jupyterhub-openid-client.config.userinfo_url
login_service = "Keycloak"
- username_key = "preferred_username"
+ username_claim = "preferred_username"
claim_groups_key = "roles"
allowed_groups = ["jupyterhub_admin", "jupyterhub_developer"]
admin_groups = ["jupyterhub_admin"]
- tls_verify = false
+ validate_server_cert = false
}
(we should probably toggle validate_server_cert to true and only allow it to be false during local deployment; I opened https://github.com/nebari-dev/nebari/issues/2329).
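For readers more used to jupyterhub_config.py than to the Terraform block, the group-related settings above might look roughly like this in Python. This is a config fragment rather than a standalone script, and it assumes an oauthenticator release that supports manage_groups; client credentials and URLs are left out because Nebari injects them from Terraform.

```python
# jupyterhub_config.py fragment: `get_config` is injected by JupyterHub
# when it loads this file, so this does not run as a standalone script.
c = get_config()  # noqa: F821

c.JupyterHub.authenticator_class = "oauthenticator.generic.GenericOAuthenticator"

# Sync the groups found in the OAuth claims into JupyterHub groups on login.
c.GenericOAuthenticator.manage_groups = True

# Replacement for the deprecated username_key option.
c.GenericOAuthenticator.username_claim = "preferred_username"

# Claim that carries the group names; here the Keycloak "roles" claim.
c.GenericOAuthenticator.claim_groups_key = "roles"
c.GenericOAuthenticator.allowed_groups = {"jupyterhub_admin", "jupyterhub_developer"}
c.GenericOAuthenticator.admin_groups = {"jupyterhub_admin"}
```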
Roles are a bit more tricky and will require actually overriding the Authenticator class and possibly more work. I will open a PR. I can target the older version of oauthenticator for now as the PR adding support for manage_groups was not yet released.
Here are details on what the API responses look like with manage_groups on:
That results in:
[I JupyterHub user:316] Adding user mike to group(s): {'grafana_developer', 'query-users', 'manage-identity-providers', 'manage-clients', 'manage-account', 'manage-realm', 'view-profile', 'argo-admin', 'dask_gateway_developer', 'grafana_admin', 'view-identity-providers', 'jupyterhub_admin', 'view-realm', 'view-authoriz
[I JupyterHub user:328] Creating new group grafana_developer for user mike
[I JupyterHub user:328] Creating new group query-users for user mike
[I JupyterHub user:328] Creating new group manage-identity-providers for user mike
[I JupyterHub user:328] Creating new group manage-clients for user mike
[I JupyterHub user:328] Creating new group manage-account for user mike
[I JupyterHub user:328] Creating new group manage-realm for user mike
[I JupyterHub user:328] Creating new group view-profile for user mike
[I JupyterHub user:328] Creating new group argo-admin for user mike
[I JupyterHub user:328] Creating new group dask_gateway_developer for user mike
[I JupyterHub user:328] Creating new group grafana_admin for user mike
[I JupyterHub user:328] Creating new group view-identity-providers for user mike
[I JupyterHub user:328] Creating new group jupyterhub_admin for user mike
[I JupyterHub user:328] Creating new group view-realm for user mike
[I JupyterHub user:328] Creating new group view-authorization for user mike
[I JupyterHub user:328] Creating new group jupyterhub_developer for user mike
[I JupyterHub user:328] Creating new group view-clients for user mike
[I JupyterHub user:328] Creating new group query-groups for user mike
[I JupyterHub user:328] Creating new group conda_store_developer for user mike
[I JupyterHub user:328] Creating new group view-events for user mike
[I JupyterHub user:328] Creating new group query-realms for user mike
[I JupyterHub user:328] Creating new group impersonation for user mike
[I JupyterHub user:328] Creating new group realm-admin for user mike
[I JupyterHub user:328] Creating new group create-client for user mike
[I JupyterHub user:328] Creating new group conda_store_superadmin for user mike
[I JupyterHub user:328] Creating new group argo-viewer for user mike
[I JupyterHub user:328] Creating new group argo-developer for user mike
[I JupyterHub user:328] Creating new group manage-events for user mike
[I JupyterHub user:328] Creating new group grafana_viewer for user mike
[I JupyterHub user:328] Creating new group manage-users for user mike
[I JupyterHub user:328] Creating new group dask_gateway_admin for user mike
[I JupyterHub user:328] Creating new group manage-account-links for user mike
[I JupyterHub user:328] Creating new group manage-authorization for user mike
[I JupyterHub user:328] Creating new group query-clients for user mike
[I JupyterHub user:328] Creating new group view-users for user mike
[I JupyterHub user:328] Creating new group conda_store_admin for user mike
[I JupyterHub base:837] User logged in: mike
Then for /api/users I get:
[
{
"admin": true,
"groups": [
"grafana_developer",
"query-users",
"manage-identity-providers",
"manage-clients",
"manage-account",
"manage-realm",
"view-profile",
"argo-admin",
"dask_gateway_developer",
"grafana_admin",
"view-identity-providers",
"jupyterhub_admin",
"view-realm",
"view-authorization",
"jupyterhub_developer",
"view-clients",
"query-groups",
"conda_store_developer",
"view-events",
"query-realms",
"impersonation",
"realm-admin",
"create-client",
"conda_store_superadmin",
"argo-viewer",
"argo-developer",
"manage-events",
"grafana_viewer",
"manage-users",
"dask_gateway_admin",
"manage-account-links",
"manage-authorization",
"query-clients",
"view-users",
"conda_store_admin"
],
"pending": null,
"auth_state": null,
"kind": "user",
"server": "/user/mike/",
"roles": [
"user",
"admin"
],
"name": "mike"
}
]
And for /api/groups:
[
{
"properties": {},
"roles": [],
"name": "grafana_developer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "query-users",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-identity-providers",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-clients",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-account",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-realm",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-profile",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "argo-admin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "dask_gateway_developer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "grafana_admin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-identity-providers",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "jupyterhub_admin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-realm",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-authorization",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "jupyterhub_developer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-clients",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "query-groups",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "conda_store_developer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-events",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "query-realms",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "impersonation",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "realm-admin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "create-client",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "conda_store_superadmin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "argo-viewer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "argo-developer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-events",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "grafana_viewer",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-users",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "dask_gateway_admin",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-account-links",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "manage-authorization",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "query-clients",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "view-users",
"kind": "group",
"users": [
"mike"
]
},
{
"properties": {},
"roles": [],
"name": "conda_store_admin",
"kind": "group",
"users": [
"mike"
]
}
]
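A consumer of /api/groups will usually want the inverse view, that is, which groups a given user belongs to. A minimal sketch of that inversion, using two entries from the response above as sample data (the function name is made up for illustration):

```python
from collections import defaultdict


def groups_by_user(group_models: list[dict]) -> dict[str, list[str]]:
    """Invert a /api/groups listing into a user -> sorted group names mapping."""
    membership = defaultdict(set)
    for group in group_models:
        for user in group.get("users", []):
            membership[user].add(group["name"])
    return {user: sorted(names) for user, names in membership.items()}


# Two entries copied from the /api/groups response above.
sample_groups = [
    {"properties": {}, "roles": [], "name": "jupyterhub_admin", "kind": "group", "users": ["mike"]},
    {"properties": {}, "roles": [], "name": "grafana_developer", "kind": "group", "users": ["mike"]},
]
print(groups_by_user(sample_groups))
```

Note that every group in the response has an empty "roles" list: manage_groups creates the groups but does not attach any JupyterHub roles to them.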
I can target the older version of oauthenticator for now as the PR adding support for manage_groups was not yet released.
Well, it looks like targeting the very outdated version we have, while possible, may not be worth it, because the codebase has diverged significantly over the two years since it was last updated.
Currently JupyterHub roles have to be defined at configuration time. There is an issue proposing to allow roles to be configured at runtime:
- https://github.com/jupyterhub/jupyterhub/issues/3858
There is a (stale?) PR adding a REST API for runtime role creation:
- https://github.com/jupyterhub/jupyterhub/pull/3980
But possibly more handy would be implementing manage_roles support in JupyterHub (see https://github.com/jupyterhub/jupyterhub/issues/3858#issuecomment-1999996881).
@aktech can we pre-define a set of roles and only use Keycloak to get the user-role association (for the predefined roles), or do we need to be able to get arbitrary roles from Keycloak? If we need arbitrary roles, the way forward is to fetch the roles from Keycloak at JupyterHub configuration time (or contribute upstream, e.g. the manage_roles approach). The limitation with fetching from Keycloak at JupyterHub configuration time is that any change to roles requires a restart of JupyterHub.
I infer that fetching from Keycloak at JupyterHub config time should be feasible, as Keycloak starts up before JupyterHub is set up:
https://github.com/nebari-dev/nebari/blob/a06fcc5aec9757e54cc1d670d119477cb6a7056c/src/nebari/plugins.py#L35-L38
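The pre-defined-roles option could be sketched as follows: the role names and scopes live in the JupyterHub configuration, and Keycloak only supplies which user holds which role. The role names and the shape of the `keycloak_roles_by_user` mapping are hypothetical; the output follows the c.JupyterHub.load_roles format, including its optional "users" key.

```python
# Hypothetical pre-defined roles: role name -> JupyterHub scopes.
PREDEFINED_ROLES = {
    "app-sharer": ["self", "shares!user"],
    "group-reader": ["read:groups:name"],
}


def build_load_roles(predefined: dict, keycloak_roles_by_user: dict) -> list[dict]:
    """Build load_roles entries, assigning users based on their Keycloak roles.

    Keycloak roles that are not pre-defined are ignored.
    """
    roles = []
    for name, scopes in predefined.items():
        users = sorted(
            user for user, names in keycloak_roles_by_user.items() if name in names
        )
        roles.append({"name": name, "scopes": list(scopes), "users": users})
    return roles


# Assumed shape of the user-role association fetched from Keycloak.
association = {
    "mike": ["app-sharer", "grafana_admin"],
    "ana": ["group-reader", "app-sharer"],
}
load_roles = build_load_roles(PREDEFINED_ROLES, association)
print(load_roles)
```

Because load_roles is read at configuration time, an association computed this way only changes when JupyterHub restarts, which is the limitation described above.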
Well, it looks like targeting the very outdated version we have, while possible, may not be worth it, because the codebase has diverged significantly over the two years since it was last updated.
agreed, makes sense.
@aktech can we pre-define a set of roles and only use Keycloak to get the user-role association
I believe that'll do for now, as long as it's dynamic: changes to role associations in Keycloak should show up in the API in real time, without requiring a JupyterHub restart.
or do we need to be able to get arbitrary roles from Keycloak?
Not urgent from app sharing point of view, we can definitely target that later.
Here are details on what the API responses look like with manage_groups on:
If a group is deleted in Keycloak, is that reflected in JupyterHub immediately?
No. Currently the user needs to log out and log back in for it to be reflected. However, we can set:
- Authenticator.refresh_pre_spawn = True to ensure that the groups/roles are fetched from Keycloak before spawning a server
- the auth cookie expiration to something ridiculously short like 5 minutes, so that it will force checking back the auth from Keycloak every so often (this may have side effects and is probably a bad idea).
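In jupyterhub_config.py terms, the two options above might look like the fragment below. This is only a sketch: whether a refresh actually re-reads groups depends on the authenticator implementing refresh_user, and the cookie setting is shown only to make the second option concrete.

```python
# jupyterhub_config.py fragment: `get_config` is injected by JupyterHub
# when it loads this file, so this does not run as a standalone script.
c = get_config()  # noqa: F821

# Option 1: refresh the user's auth info right before spawning a server.
c.Authenticator.refresh_pre_spawn = True

# Option 2 (probably a bad idea, as noted above): expire the login cookie
# after about 5 minutes. The setting is expressed in days.
c.JupyterHub.cookie_max_age_days = 5 / (24 * 60)
```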
It might be possible to configure keycloak to send a REST API request to JupyterHub to trigger the refresh. There is an endpoint for removing a user from a group and for removing a group altogether, but there are no corresponding endpoints for roles (but there is a draft PR for it).
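To make the group endpoints concrete, here is a sketch of the two requests that something on the Keycloak side would need to send. It only builds the requests and does not send them; the hub URL, the token, and the helper names are placeholders, and how Keycloak would trigger such calls is left open.

```python
import json
import urllib.request
from urllib.parse import quote


def remove_users_request(hub_api_url: str, group: str, users: list[str], token: str):
    """Build DELETE <hub_api_url>/groups/<group>/users to drop users from a group."""
    return urllib.request.Request(
        f"{hub_api_url.rstrip('/')}/groups/{quote(group, safe='')}/users",
        data=json.dumps({"users": list(users)}).encode(),
        headers={"Authorization": f"token {token}", "Content-Type": "application/json"},
        method="DELETE",
    )


def delete_group_request(hub_api_url: str, group: str, token: str):
    """Build DELETE <hub_api_url>/groups/<group> to remove the group altogether."""
    return urllib.request.Request(
        f"{hub_api_url.rstrip('/')}/groups/{quote(group, safe='')}",
        headers={"Authorization": f"token {token}"},
        method="DELETE",
    )


# Placeholder URL and token; pass the request to urllib.request.urlopen to send it.
req = remove_users_request("https://hub.example/hub/api", "analyst", ["mike"], "TOKEN")
print(req.get_method(), req.full_url)
```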
No. Currently the user needs to log out and log back in for it to be reflected. However, we can set:
I think this is reasonable for our use case, the alternatives are not feasible.
Making a call to JupyterHub API, on: https://<NEBARI-URL>/hub/api/users/[email protected] I noticed the following:
{
"roles": [
"admin",
"user"
],
"last_activity": "2024-04-03T13:47:46.510679Z",
"server": null,
"pending": null,
"admin": true,
"groups": [],
"created": "2024-03-14T17:06:47.354116Z",
"name": "[email protected]",
"kind": "user",
"auth_state": {
"access_token": "<SANITIZED>",
"refresh_token": "<SANITIZED>",
"oauth_user": {
"sub": "<SANITIZED>",
"email_verified": false,
"roles": [
"jupyterhub_admin",
"jupyterhub_developer",
"dask_gateway_developer",
"grafana_viewer",
"argo-viewer",
"conda_store_developer",
"manage-account",
"manage-account-links",
"view-profile"
],
"name": "Amit Kumar",
"groups": [
"/analyst"
],
"jupyterlab_profiles": [
"Small Instance"
],
"preferred_username": "[email protected]",
"given_name": "Amit ",
"family_name": "Kumar",
"email": "[email protected]"
},
"scope": [
"profile",
"email"
]
},
"servers": {}
}
I see the keycloak roles and groups are present in:
- auth_state.oauth_user.roles
- auth_state.oauth_user.groups
I found this while investigating how dask_gateway permissions work, https://github.com/nebari-dev/nebari/blob/80949136daea59b358e85c8ae49f9349cf315bb0/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/dask-gateway/files/gateway_config.py#L75
If the structure of the response is similar for any other authenticator besides Keycloak (which needs investigation), then we might be fine just using the groups and roles from auth_state in jhub-apps. Any thoughts?
Well, this might not work out of the box, because we also need to be able to map everything to JupyterHub roles/groups.
Like for example:
If an admin creates a role on keycloak that says a user has the ability to share a server, then that needs to be added in jupyterhub to actually have the permissions, equivalent to:
c.JupyterHub.load_roles = [
{
"name": "user",
"scopes": ["self", "shares!user", "read:users:name", "read:groups:name"],
},
]
This also means a role is not just a string; it can be an object. It could be defined in Keycloak with the role name as name and the scopes stored as role attributes.
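A sketch of that mapping, assuming the Keycloak role arrives as a dict with a "name" and an "attributes" map of string lists, and that the scopes are stored under an attribute called "scopes" (both the attribute name and the comma-splitting are assumptions):

```python
def keycloak_role_to_jupyterhub(role: dict, scopes_attribute: str = "scopes") -> dict:
    """Convert a Keycloak role with a scopes attribute into a load_roles entry.

    Each attribute value may hold one scope or several comma-separated scopes.
    """
    attributes = role.get("attributes") or {}
    scopes = []
    for value in attributes.get(scopes_attribute, []):
        scopes.extend(part.strip() for part in value.split(",") if part.strip())
    return {"name": role["name"], "scopes": scopes}


# Hypothetical Keycloak role; the scopes are the ones from the example above.
keycloak_role = {
    "name": "app-sharer",
    "attributes": {"scopes": ["self,shares!user", "read:users:name"]},
}
print(keycloak_role_to_jupyterhub(keycloak_role))
```

A role without the attribute maps to an empty scope list, so it would exist in JupyterHub but grant nothing.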
If the structure of the response is similar for any other authenticator besides keycloak
In OAuthenticator, the groups claim is selected with the claim_groups_key config option, which can come from the OAUTH2_GROUPS_KEY environment variable. Subclasses can also override get_user_groups if the structure is non-trivial, but I don't think any of the common OAuth authenticators do this. So at the very least I would not rely on the value being under the "groups" key.
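If jhub-apps did read the claims from auth_state, the key names would therefore need to be configurable. A minimal sketch under that assumption, treating each key as a plain top-level name in oauth_user (the function and parameter names are made up for illustration):

```python
def claims_from_auth_state(
    user_model: dict, groups_key: str = "groups", roles_key: str = "roles"
) -> dict:
    """Read groups and roles from auth_state.oauth_user using configurable keys.

    auth_state can be null in the user model, in which case both lists are empty.
    """
    auth_state = user_model.get("auth_state") or {}
    oauth_user = auth_state.get("oauth_user") or {}
    return {
        "groups": list(oauth_user.get(groups_key, [])),
        "roles": list(oauth_user.get(roles_key, [])),
    }


# Trimmed copy of the response shown earlier in the thread.
user_model = {
    "name": "someone",
    "groups": [],
    "auth_state": {
        "oauth_user": {
            "roles": ["jupyterhub_admin", "jupyterhub_developer"],
            "groups": ["/analyst"],
        }
    },
}
print(claims_from_auth_state(user_model))
```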
This also means a role is not just a string; it can be an object. It could be defined in Keycloak with the role name as name and the scopes stored as role attributes.
Right, so we will need to pass the role attributes from Keycloak via oauth so that they are accessible in oauth_user; or if we have roles defined on startup we could just define them in Authenticator.load_managed_roles which will take the same format as c.JupyterHub.load_roles.
Yep, we may have some pre-defined roles but mostly we want to import from keycloak, this gives the most flexibility in terms of customisation, as different deployments (at different orgs) might need different set of permissions (roles) for different set of users/groups.
The PR implementing managed roles in JupyterHub was merged today and will be included in JupyterHub 5.0.
Awesome, that's great news! Is this one: https://github.com/jupyterhub/jupyterhub/issues/3858 getting closed completely? I see it referenced in your PR.
I think it may stay open as it lists a number of other ideas like managing roles via REST API, or allowing users to grant roles (I think less needed now given that we have share codes).
Ah, I see. After your PR, are we able to dynamically update roles (like sync from keycloak), without restarting hub?
Yes.