nebari Get JupyterHub `groups` from Keycloak, support `oauthenticator` 16.3+

Reference Issues or PRs

A task for https://github.com/nebari-dev/nebari/issues/2308
Can be merged straightaway but for the groups to get populated from keycloak requires https://github.com/nebari-dev/nebari-docker-images/pull/131

What does this implement/fix?

Put a x in the boxes that apply

[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds a feature)
[ ] Breaking change (fix or feature that would cause existing features not to work as expected)
[ ] Documentation Update
[ ] Code style update (formatting, renaming)
[ ] Refactoring (no functional changes, no API changes)
[ ] Build related changes
[ ] Other (please describe):

Testing

[x] Did you test the pull request locally?
[ ] Did you add new tests?

To test:

Set the image to use https://github.com/nebari-dev/nebari-docker-images/pull/131:

default_images:
  jupyterhub: quay.io/nebari/nebari-jupyterhub:oauthenticator-16.3.0

Deploy
Open https://your-nebari-deployment/hub/token
Open Dev Tools in your browser, switch to Network tab
Press "Request new API token"
Right click on the request in Network tab → copy as fetch
Paste the request, change method to "GET", remove "body", change URL to /hub/api/users
Re-send the request
Click on the new request, preview the response
It should contain groups from keycloak e.g.:

[
    {
        "kind": "user",
        "auth_state": null,
        "groups": [
            "grafana_developer",
            "query-users",
            "manage-identity-providers",
            "manage-clients",
            "manage-account",
            "manage-realm",
            "view-profile",
            "argo-admin",
            "dask_gateway_developer",
            "grafana_admin",
            "view-identity-providers",
            "jupyterhub_admin",
            "view-realm",
            "view-authorization",
            "jupyterhub_developer",
            "view-clients",
            "query-groups",
            "conda_store_developer",
            "view-events",
            "query-realms",
            "impersonation",
            "realm-admin",
            "create-client",
            "conda_store_superadmin",
            "argo-viewer",
            "argo-developer",
            "manage-events",
            "grafana_viewer",
            "manage-users",
            "dask_gateway_admin",
            "manage-account-links",
            "manage-authorization",
            "query-clients",
            "view-users",
            "conda_store_admin"
        ],
        "admin": true,
        "roles": [
            "user",
            "admin"
        ],
        "name": "USER",
        "pending": null,
        "server": "/user/USER/",
        "servers": {
            "": {
                "name": "",
                "pending": null,
                "ready": true,
                "stopped": false,
                "url": "/user/USER/",
                "user_options": {
                    "profile": "small-instance"
                },
                "progress_url": "/hub/api/users/USER/server/progress",
                "state": {
                    "pod_name": "jupyter-USER"
                }
            }
        }
    }
]

Any other comments?

Mar 27 '24 16:03 krassowski

The cypress failure indicates that permissions did change. It looks like we need to adjust test permissions for the test to pass:

First Test -- Check Nebari login and start JupyterLab (failed)

Mar 27 '24 17:03 krassowski

I can reproduce the failure when using the old oauthenticator version; the intent of this PR was to make it backward compatible, so I will see if we can change something to make it such.

Edit: I actually see this with the new version too.

Mar 27 '24 17:03 krassowski

Previously the keycloak roles where mapped to JupyterHub groups:

https://github.com/nebari-dev/nebari/blob/3e1dfde9c56cfe6c151b36333713d6fbea30bf86/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/main.tf#L247-L251

https://github.com/nebari-dev/nebari/blob/3e1dfde9c56cfe6c151b36333713d6fbea30bf86/src/_nebari/stages/kubernetes_services/template/modules/kubernetes/services/jupyterhub/main.tf#L161-L162

Of note previously extracting the group data (for some reason from roles) was working, but these groups were not persisted in JupyterHub (which is why we are moving to oauthenticator 16.3+).

The logs:

│ [I 2024-03-27 18:09:45.199 JupyterHub generic:185] Validating if user claim groups match any of ['jupyterhub_admin', 'jupyterhub_developer']                                                                                                                                                        │
│ [W 2024-03-27 18:09:45.199 JupyterHub base:843] Failed login for unknown user                                                                                                                                                                                                                       │

highlight that this is caused by the misalignment between allowed_groups and groups from keycloak. These are (from keycloak, allowed):

['/analyst', '/developer', '/users'], ['jupyterhub_admin', 'jupyterhub_developer'] # 15.x
{'/developer', '/users', '/analyst'}, {'jupyterhub_admin', 'jupyterhub_developer'} # 16.3

Mar 27 '24 18:03 krassowski

Looking more through the codebase, it appears that the idea of mapping from keycloak group to a service-specific group roles is prevalent in nebari.

I think that the solution is to align the JupyterHub and keycloak roles as follows:

change the claim_groups_key to "groups" (already done)
change allowed_groups in JupyterHub to ['/analyst', '/developer', '/admin'] (to be done)
fetch the roles from keycloak in addition to "groups"

@aktech any thoughts?

Mar 29 '24 13:03 krassowski

change the claim_groups_key to "groups" (already done)

I assume this will look for the key "groups" in the keycloak's userdata url to populate groups in JupyterHub? If yes, then it makes sense.

change allowed_groups in JupyterHub to ['/analyst', '/developer', '/admin'] (to be done)

Will that prevent users in any other group to login to JupyterHub (Nebari)?

Context on this: We would want people to be able to create any number of groups with any roles (pre-existing) attached to them. The arbitrary groups creation is already possible. The arbitrary roles creation is not possible yet as they are static in nebari at the moment (explicitly mapped to specific service roles), but after the permissions overhaul, we should be able to create arbitrary roles, like say "I want to create a role to have read access to conda environments in the xyz namespace", hence it's important to be able to fetch all groups/roles from keycloak, irrespective of the fact they are created at the deployment time or after deployment via keycloak's GUI.

fetch the roles from keycloak in addition to "groups"

Yes.

Apr 01 '24 15:04 aktech

Will that prevent users in any other group to login to JupyterHub (Nebari)?

Yes. But! We don't need to use allowed_groups for allow-listing users. There is a few choices here:

allowed_users/admin_users - well, not very flexible
check_allowed()/is_admin() - we can implement arbitrary logic
do nothing and just rely on roles

Ultimately I think we want the last one, but at this time the roles syncing is work in progress.

I think I can already make it work by overriding check_allowed()/is_admin() in the Authenticator because the auth_state already does contain the roles from Keycloak.

Do we want to do that in this PR?

Apr 01 '24 15:04 krassowski

I am curious, do we even need this allowed_*? As in, is it necessary to have them? I think it's fine to allow all groups and roles from keycloak.

do nothing and just rely on roles

Will this fetch all the groups from keycloak? If yes, then this is what we need.

Ultimately I think we want the last one, but at this time the roles syncing is work in progress. I think I can already make it work by overriding check_allowed()/is_admin() in the Authenticator because the auth_state already does contain the roles from Keycloak.

Since we're fetching all the groups as per third option, this is not required.

Do we want to do that in this PR?

Certainly not, separate PR is better infact.

Apr 01 '24 16:04 aktech

I am curious, do we even need this allowed_*? As in, is it necessary to have them? I think it's fine to allow all groups and roles from keycloak.

By default OAuthenticator will not allow any user unless they meet one of the allowed_* rules, docs:

Default behavior: nobody is allowed!

The default behavior of OAuthenticator (starting with version 16) is to block all users unless explicitly authorized via some allow configuration. If you want anyone to be able to use your hub, you must specify at least one allow configuration.

Changed in version 16: Prior to OAuthenticator 16, allow_all was implied if no other allow configuration was specified. Starting from 16, allow_all can only be enabled explicitly.

What you are proposing is essentially setting allow_all = True. We can do that. I am not quite if there are any downsides, but here is what the docs has to say about it:

This is appropriate when you use an authentication provider (e.g. an institutional single-sign-on provider), where everyone who has an account in the provider should have access to your Hub. It may also be appropriate for unadvertised short-lived hubs, e.g. dedicated hubs for workshops that will be shutdown after a day, where you may decide it is acceptable to allow anyone who finds your hub to login.

Is there a scenario when a nebari user would be allowed access to say conda-store but not to JupyterHub? I guess it does not make sense because conda-store is a resource for JupyterHub users, and any user who is trusted with conda-store would usually also trusted to access JupyterHub... unless in a high confidentiality deployment when someone wanted to get an external eye on conda-store config, but without exposing the shared file system? But we can also have fine-grained file system permissions so I guess it does not matter if configured properly?

Apr 04 '24 13:04 krassowski

FYI while I was referencing OAuthenticator above, this behaviour is also going to be the same in JupyterHub 5.0 in general, see https://github.com/jupyterhub/jupyterhub/pull/4701.

Apr 04 '24 13:04 krassowski

By default OAuthenticator will not allow any user unless they meet one of the allowed_* rules, docs:

It's fine to have allowed_* as long as it can be synced with keycloak, ideally without restarting the hub. Imagine, we add a new group with x, y, z (roles) permissions and we would like to make sure it takes effect without having to restart the hub. If the user needs to logout/login, that's a reasonable scenario.

Is there a scenario when a nebari user would be allowed access to say conda-store but not to JupyterHub?

I don't think so.

I guess it does not make sense because conda-store is a resource for JupyterHub users, and any user who is trusted with conda-store would usually also trusted to access JupyterHub... unless in a high confidentiality deployment when someone wanted to get an external eye on conda-store config, but without exposing the shared file system? But we can also have fine-grained file system permissions so I guess it does not matter if configured properly?

Yes, that makes sense. Regarding fine-grained permissions here is a proposal: https://github.com/nebari-dev/governance/issues/47

Apr 04 '24 13:04 aktech

It's fine to have allowed_* as long as it can be synced with keycloak, ideally without restarting the hub. Imagine, we add a new group with x, y, z (roles) permissions and we would like to make sure it takes effect without having to restart the hub. If the user needs to logout/login, that's a reasonable scenario.

Also, If this is true, this PR is good to merge.

Apr 04 '24 13:04 aktech

It's fine to have allowed_* as long as it can be synced with keycloak, ideally without restarting the hub. Imagine, we add a new group with x, y, z (roles) permissions and we would like to make sure it takes effect without having to restart the hub. If the user needs to logout/login, that's a reasonable scenario.

The groups are synced from keycloak, but allowed_groups are hard-coded. Is that what you meant?

Apr 04 '24 13:04 krassowski

The groups are synced from keycloak, but allowed_groups are hard-coded. Is that what you meant?

I meant all groups should be allowed, but I see what's happening. I understand now that we need to make sure every user who's allowed to login to Nebari (JupyterHub), needs to be in one of the allowed_groups (which we do already now), that's fine, we can have minimal permissions in one of the allowed_groups and make sure every user is there by default so that they can login.

So, no change required now, this PR looks good to me.

Apr 04 '24 13:04 aktech