feat: oauth authorization flow
NOTE: Would appreciate a thorough code review since it touches authentication, though this PR does not allow access to any endpoints yet (the access tokens do nothing)
Problem
We want an OAuth flow that users can use to authorize third-party applications. This should replace personal API keys for integrations with PostHog.
Changes
- Add an OAuth authorization page and an OAuth application metadata endpoint
- Add `/oauth/authorize` and `/oauth/token` endpoints for OAuth 2 authorization code flow
- Add `/.well-known/jwks.json` for clients to retrieve the public key and `/.well-known/openid-configuration/` for OAuth endpoint discovery
- Add token revocation endpoint `/oauth/revoke/` for revoking tokens
- Add token introspection endpoint `/oauth/introspect/`
- Add `/oauth/userinfo/` endpoint for getting information about a user
How did you test this code?
Added a lot of unit test coverage for the authorize and token endpoints
Video walkthrough of local flow: https://github.com/user-attachments/assets/170c8e8c-e9d7-49b4-9301-159bc8b8a8af
Local testing instructions for authorization code flow with a confidential client:
- Go to http://localhost:8010/admin/ and create a new OAuth Application. Note down the `CLIENT_ID` and `CLIENT_SECRET`.
  - Redirect uris: "https://google.com"
  - Client type: Confidential
  - Authorization grant type: Authorization code
  - Algorithm: RSA-256
  - Organization: any, choose one of your local ones
- Visit the authorization page: http://localhost:8010/oauth/authorize/?response_type=code&code_challenge=22K5_OTunigWRcbNo4nbmTJRX_xOun6z1-iJZByc6Ps&code_challenge_method=S256&client_id=CLIENT_ID&redirect_uri=https://google.com&scope=experiment:read+openid
- Pick a scope and click authorize. You should get a code in the URL you are redirected to (https://google.com).
- Exchange the authorization code:
curl -X POST \
-H "Cache-Control: no-cache" \
-H "Content-Type: application/x-www-form-urlencoded" \
"http://localhost:8010/oauth/token/" \
-d "client_id=CLIENT_ID" \
-d "client_secret=CLIENT_SECRET" \
-d "code=AUTHORIZATION_CODE" \
-d "code_verifier=16GH0JGSTGPVVS7Y2TOIU52Y0IW8E823HMY5H9SGOP4ALH7KS9FE6ZOTNB6QK030YJBM4Z4TU7YLJENONBIPT4HHPBRE0YS72IA9BNOC" \
-d "redirect_uri=https://google.com" \
-d "grant_type=authorization_code"
That should get you an access token and a refresh token. This process would usually be performed by an OAuth client.
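The manual steps above are roughly what an OAuth client does programmatically. As a minimal sketch using only the Python standard library, the two requests could be assembled like this. The helper names `build_authorize_url` and `build_token_request_body` are hypothetical, and every argument value is a placeholder rather than something taken from this PR.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8010"  # local dev server used in the instructions


def build_authorize_url(client_id: str, redirect_uri: str, code_challenge: str, scope: str) -> str:
    """Build the URL the user visits to approve the client (hypothetical helper)."""
    params = {
        "response_type": "code",
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,  # space-separated scopes; urlencode turns spaces into "+"
    }
    return f"{BASE_URL}/oauth/authorize/?{urlencode(params)}"


def build_token_request_body(
    client_id: str,
    client_secret: str,
    code: str,
    code_verifier: str,
    redirect_uri: str,
) -> str:
    """Build the form-encoded body POSTed to /oauth/token/ (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,  # the authorization code taken from the redirect URL
        "code_verifier": code_verifier,
        "redirect_uri": redirect_uri,
        "grant_type": "authorization_code",
    }
    return urlencode(params)
```

The body would be sent with `Content-Type: application/x-www-form-urlencoded`, the same header the curl command uses.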
📸 UI snapshots have been updated
1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
- chromium: 0 added, 1 modified, 0 deleted (diff for shard 2)
- webkit: 0 added, 0 modified, 0 deleted
Triggered by this commit.
📸 UI snapshots have been updated
3 snapshot changes in total. 0 added, 3 modified, 0 deleted:
- chromium: 0 added, 3 modified, 0 deleted (diff for shard 1, diff for shard 2)
- webkit: 0 added, 0 modified, 0 deleted
Triggered by this commit.
Size Change: +473 B (+0.02%)
Total Size: 2.58 MB
| Filename | Size | Change |
|---|---|---|
| frontend/dist/toolbar.js | 2.58 MB | +473 B (+0.02%) |
Thank you for the review
Awesome work putting this together. I've tested everything locally and added my thoughts below. Just a first pass, happy to do more. Sorry for so many bullets. Feels so close, loving the new authorize UI.
Things that came up while testing:
- We need to add better handling for the scopes. For example, if I pass in `scope=experiment%3Aread%20openid%20surveys%3athisisanerror`, I see `Read access to surveys:thisisanerror` and then I get redirected to google.com?error=invalid_scope, but no other errors are handled in the PostHog UI
Yeah, we should be handling these errors and redirecting before the user views the page, rather than when they hit authorize. I'll update it to do that.
As for passing the error back to the redirect uri without showing an error in PostHog, the spec requires us to do that. If we return a 200 here, the OAuth client doesn't get to decide what to do with the error. It feels a bit awkward when the redirect uri is google.com, but usually that'd be an OAuth client that is expecting this error.
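To illustrate the behaviour described here: in the authorization code flow (RFC 6749, section 4.1.2.1), errors such as `invalid_scope` are reported by redirecting back to the client's registered redirect URI with an `error` query parameter, while an invalid client or redirect URI must not trigger a redirect at all. The following is a minimal sketch of building that redirect; `build_error_redirect` is a hypothetical helper and is not how Django OAuth Toolkit implements it internally.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_error_redirect(redirect_uri: str, error: str, state: Optional[str] = None) -> str:
    """Append OAuth error parameters to a redirect URI (hypothetical helper).

    Assumes redirect_uri has already been validated against the client's
    registered URIs; if that validation fails, the server must show an
    error page instead of redirecting.
    """
    parts = urlsplit(redirect_uri)
    # Preserve any query parameters already present on the redirect URI.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("error", error))
    if state is not None:
        # Echo the client's state value back so it can match the response.
        query.append(("state", state))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `build_error_redirect("https://google.com", "invalid_scope")` produces the `google.com?error=invalid_scope` redirect seen during testing.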
- If I'm on an account where the team hasn't finished onboarding, I'm redirected to http://localhost:8010/project/1/products?next=%2Foauth%2Fauthorize. We should probably bypass onboarding? After onboarding it redirects correctly.
Good callout, will update that.
- I got an error around `OIDC_RSA_PRIVATE_KEY` not being set locally. I needed to change it manually in `web.py` to a random string
Updated the issue notes
- There is a `View on site` button on the admin oauth app page (top right) that doesn't work and we should remove
- I'm not sure what `code_challenge` should be set to. Is it always the same?
- If `client_type`, `authorization_grant_type`, `algorithm` are always going to be the same value, we should make this read only in admin with a default value
I'll deal with the admin portal ones in a separate PR. Django OAuth Toolkit ships with its own admin views, so we need to override them and disable them from being registered automatically. It's not hard, just some boilerplate I'd rather leave out of this PR.
- Need to add loading to the submit button on the authorize page, feels like it didn't work but it's just waiting on the backend redirect
Yup - that'd be helpful, will add it.
- Your example curl is missing the `code` param, and what is `code_verifier`? I'm getting an error `{"error": "invalid_client"}` and I think it's because of the `code_verifier`
Updated the issue notes. Whilst testing you should be able to use the values I've given you here for code_challenge and code_verifier. This is for PKCE - the code_challenge is a SHA-256 hash of the code_verifier. Any code_verifier will do so long as you hash it to get the corresponding code_challenge.
Note: It's actually code_challenge = base64url_encode(sha256(ASCII(code_verifier))) with the trailing "=" padding removed - why do they make this so complicated 🤯
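For anyone generating their own pair while testing, the S256 derivation can be sketched in a few lines of standard-library Python. This follows the transformation defined in RFC 7636; the function names `make_code_verifier` and `make_code_challenge` are made up for illustration.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 64) -> str:
    """Return a random URL-safe verifier (86 characters for 64 random bytes)."""
    return secrets.token_urlsafe(num_bytes)


def make_code_challenge(code_verifier: str) -> str:
    """Derive the S256 challenge: base64url(sha256(ascii(verifier))) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge to `/oauth/authorize/` and keeps the verifier private until the `/oauth/token/` request, so an intercepted authorization code cannot be exchanged without it.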
Other questions
- How will users create their own oauth applications?
Plan is to let org admins create/edit applications from their settings page.
- Are we limiting who can create oauth application?
Not planning to. We might want to put rate limits per application (on top of the existing rate limits per user) to avoid one application hogging all of a user's calls, but I think we can implement that if the situation arises.
- How can we test this out with a customer? Should we email existing platforms using the API and asking for personal API keys to try it out?
I think we can test this internally first and then roll it out in an open beta
I haven't run it, but glanced through the code, having built out personal API keys a while ago. Generally looking good! It seems the tricky lifting is done by oauth2_provider.
Yep, I tried to change as little of django-oauth-toolkit as possible, since it is well battle-tested in production.
The 1900 lines of tests is mind-blowing. How much of it is hand-rolled vs. LLM?
There are around 85 tests there. About 30 of them are hand-written and cover the main flows and problems, and the rest were LLM generated to try and cover edge cases. I mainly tried to cover the /authorize endpoint since that is the one we're modifying from DOT. We've got good coverage there for common implementation issues, and I added some tests for the full flow / OIDC stuff to make sure things are working okay.
BTW I wish I had the time to just build something on top of this as part of review, that would really reveal everything. OAuth in our Zapier integration wen
Yeah, as a testing strategy here I think it makes sense to restrict using OAuth access tokens to only internal users initially, and build out OAuth support in a few places (e.g. MCP server, toolbar) to test the flows as a client. This will reveal any UX issues, but it doesn't help much with common attack vectors. For that, the unit tests here are most helpful.
We're also adding stricter requirements than the spec requires (e.g. PKCE is required for all clients, even confidential ones, and we're rotating refresh tokens when a new access token is requested).
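For reference, Django OAuth Toolkit exposes both of these behaviours as settings. The fragment below is a sketch of how they could be switched on in a Django settings module; it is an assumption for illustration and not a copy of the configuration in this PR, which may set them differently or rely on the library defaults.

```python
# Sketch of a Django settings fragment (assumed, not taken from this PR).
OAUTH2_PROVIDER = {
    # Require a PKCE code_challenge on every authorization request,
    # including requests from confidential clients.
    "PKCE_REQUIRED": True,
    # Issue a new refresh token whenever a client refreshes its access
    # token, so each refresh token is only used once.
    "ROTATE_REFRESH_TOKEN": True,
}
```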
I think if we have issues, they are most likely to be in the validation of the access tokens rather than issuing them, so I'll tag you for review for those PRs since the logic will be similar to personal api keys.
Okay this is ready for a re-review, I've also added some Storybook snapshots.
Obviously with it being auth related this is a risky PR, but it'll be getting plenty of testing internally before it's rolled out to iron out any UX problems and check for implementation issues.
For now, the access tokens aren't giving access to anything. I'd suggest we gate any tokens to only work for team 2 whilst we build out some example clients, and only roll out openly once we are happy with everything.