oauthenticator
oauthenticator copied to clipboard
Move group management from generic to base oauthenticator
Motivating use cases
External identity providers providing JupyterHub memberships is an extremely useful feature that should be present not just for GenericOAuthenticator but for all authenticators. But to do so in helpful ways, this PR considers two motivating use cases:
- A hub using
Auth0OAuthenticator. In Auth0,scopes granted are what is used to provide the notion of 'can the user perform this task?', which can be used as group membership. This is what auth0 recommends, there is currently no other way to do 'groups' in Auth0.scopeis insideauth_model, but notauth_state, sincescopeis granted each time the user is logged in. - In
GitHubOAuthenticator, we put the list of teams the user is in insideauth_state. This is the perfect piece of information to use for group membership. oauth_usergets put insideauth_state, and in generalauth_stateis a good place for this kinda group information to be in. Authenticators can put arbitrary stuff insideauth_stateand use them as they wish.
Approaches considered and rejected
- Each authenticator figures out how groups work within it. This was tried out in https://github.com/jupyterhub/oauthenticator/pull/739, and rejected by @minrk in https://github.com/jupyterhub/oauthenticator/pull/735#issuecomment-2041996689. I agree with this rejection.
- Allow group information to be picked up from the
auth_modelwith aauth_model_groups_key. This would be same as the currentclaim_groups_key, but pick from the entireauth_modelinstead of just from the returned user object. This was the tack this PR was taking, primarily because I thought we needed it to handle use case 1 mentioned earlier. But turns out I was wrong - I had thought thatscopewas part ofauth_modelbut notauth_state, but we do! And regardless, I also realized we don't exposeauth_modelanywhere, but we do exposeauth_state. And I had a TODO for 'document what is inauth_model', and while trying to do that, decided we shouldn't expose that to configurable tweaks like this for now. So that was reverted in b337015306f51a8a9ea8e95924a7562bfa1e56ba and a different approach was taken.
Approach this PR takes
The general approach to group management is:
- Authenticator puts something that may be relevant to group membership inside
auth_state. - There's an
auth_state_groups_keythat can be either a callable or dotted specification that generates a list of groups from something inauth_state.
This handles case (2) because list of teams is already in auth_state. And can handle (1) by us putting scope in some form inside auth_state. This also provides a clear extensible mechanism in the future for all group management - get it into auth_state (where it can be used for anything), and pick that out with auth_state_groups_key.
Backwards compatibility
The existing claim_groups_key behavior is preserved, by being passed on to auth_state_groups_key in the base. It has been marked as deprecated.
TODO
- [ ] Update docs to refer to new property
- [ ] Reword the docstring for
auth_state_groups_key
All this functionality would be extremely useful to all the authenticators, so should be in base.
I think an important question now becomes - what is the point of GenericOAuthenticator? Is that just here for backwards compatibility?
I think the answer should be 'yes'. And in the future, anything that relies on OAuth2 functionality should just go to the base OAuth2 provider, and anything specific to any of the providers can go in their own files. And we suggest people who currently use Generic migrate to just using the base OAuth2 provider.
what is the point of GenericOAuthenticator?
I believe GenericOAuthenticator should eventually go away and be merged into the base class, as long as we can do it smoothly and it doesn't complicate things unnecessarily (I don't think it will). I think the big refactor in #526 was a big step toward making that feasible.
Not having done that yet, going forward in general I think we should avoid new features in Generic in favor of putting them in the base class, at least in most cases.
I was thinking about how we can do a very common thing - sync GitHub team memberships and org memberships to JupyterHub groups. With this PR as is, that's actually not possible - team memberships are not part of userdata! But they are part of the auth_model (because there's a populate_teams_in_auth_model property already).
I think there are two paths forward here for groups to work everywhere:
- Lift that into the base Authenticator class, but the key is for things from
auth_model, not userdata. This also allows for using things likescope(which can control group membership with Auth0, from another community we are working with). We can name this appropriately, and the keep a deprecatedclaims_group_keyin Generic that notifies it's deprecated but works the way it already does. - https://github.com/jupyterhub/oauthenticator/pull/739 or similar for each authenticator we want to support good groups.
My preference is to do (1)! There's a clean backwards compat story, and the model of 'put stuff into auth_state, that is available via auth_model, and you can pull stuff out of that for groups' seems clean enough to explain.
Option 1 sounds like a great choice, and also a great opportunity to not inherit the name while preserving compatibility.
This went through a few iterations, but is ready for review again! I've updated the PR body with more detailed information as well. Please take another look when you got time, @minrk :)
After trying to write clear docs for the traitlet, I came to the conclusion that we shouldn't allow usage of these without manage_groups also being True. I've updated a paragraph under Backwards compatibiltiy with my rationale.
With that, this is ready to go!
I found https://github.com/jupyterhub/oauthenticator/issues/709, which I think is closed by this PR. And my change with respect to manage_groups reflects:
do we need to deal with additional groups, not specified upstream somehow? I'd say no, at least for now.
Which I totally agree with. Groups should have a 'single source of truth'.
Can I get someone to hit merge? Thanks!
@yuvipanda thank you for working this!!!
I'm not able to invest time to review this fully atm :/ I saw some minor fixes I'd like to see updated:
- pr is labelled as breaking, but also references "deprecated in 16.4" - i think we should update one thing or the other (would be 17.0 if it is breaking).
- If its breaking, I request a highlight written about what is breaking in the PR description or under the changelog entry to offload subsequent release work.
- I think its great if the PR title (or as a fallback early in the description) highlights new config names explicitly
Thanks @consideRatio.
I'm not able to invest time to review this fully atm :/
I don't think you need to, as long as you don't block it. I think it's seen enough eyes, and it's been sitting here for close to 3 months. In general I'd like us to move faster and be less perfectionist. For example, I see tests are failing now - probably unrelated, as they're in globus. This is super demotivating, and I don't want to lose myself as a contributor. I'm also going to proactively merge other PRs and change my attitude a little bit more, being the change I want to see.
With that said, I appreciate the other 3 points you have raised and the way you have communicated them. I've moved the references to 16.4 to 17.0, and added a section called Breaking Changes to call this out explicitly (rather than inside the Backwards compatibilty section as before. I prefer the PR title as is, but you're welcome to change it if you wish.
These tests are failing on main as well, so I opened https://github.com/jupyterhub/oauthenticator/issues/743.
Back to now thinking this PR is ready to merge.
yay, thank you @manics
@yuvipanda there are numerous hooks to facilitate local customisation. What API should local code in these hooks use to access the list of groups that the user is member to?
For example, depending on group membership, I want to filter my profile list and adjust singleuser server settings. If my code runs in auth_state_hook, options_form, pre_spawn_hook or modify_pod_hook, then it will be passed the spawner instance, or if runs in post_auth_hook it will be passed the authenticator instance and auth_model (not auth_state). I take it that accessing auth_model["groups"] would be discouraged? Accessing spawner.user.groupsyields the same as user.orm_user.groups, a list of SQAlchemy objects rather than a trivial set of strings to test membership in. Should customisations try to work with that, or try to call something like spawner.authenticator.get_user_groups(spawner.user.get_auth_state())?