sso icon indicating copy to clipboard operation
sso copied to clipboard

sso-auth: Azure AD provider

Open sporkmonger opened this issue 7 years ago • 60 comments

This pull request implements support for Azure AD as an identity provider. It supports groups passed to the upstream via the X-Forwarded-Groups header.

Notes

Requires "Read All Groups" / "Read Directory Data" app-level permissions, as well as "Sign in and read user profile" delegated permissions. The former requires either full Azure admin role or Cloud app admin role in order to approve. The following environment variables must be set:

PROVIDER: 'azure_v2'
AZURE_TENANT: '<your-tenant-uuid-here>'
APPROVAL_PROMPT: 'consent'

I haven't fiddled with timeouts as part of this PR, but they probably need to be extended because first-time sign-in can take much longer due to requiring a large number of API calls. These are done concurrently, but even so, there's a significant risk of timeout waiting on the API calls to finish on the first go-around.

This PR also adds the OIDC provider, though I don't personally have a good generic OIDC provider to actually QA it against.

Fixes #98 See also #27

sporkmonger avatar Nov 16 '18 01:11 sporkmonger

Hey Bob - Thanks for this sizable contribution! We really appreciate the work.

This PR also adds the OIDC provider, though I don't personally have a good generic OIDC provider to actually QA it against.

I've only taken a brief look so far, but if the OIDC support is orthogonal to using Azure AD as an identity provider, then we might think about splitting OIDC into a separate PR - This PR is already fairly large, and splitting off the OIDC work will make things easier to digest and review.

Happy to discuss - Thoughts?

katzdm avatar Nov 30 '18 17:11 katzdm

It's not orthogonal. The Azure AD provider uses OIDC under the hood, but it additionally introduces groups support that's not part of OIDC.

sporkmonger avatar Dec 03 '18 20:12 sporkmonger

Just leaving a message of encouragement because I'm really looking forward this 👍

sylr avatar Dec 16 '18 13:12 sylr

@sporkmonger If you move forward with @katzdm reviews I'll will be happy to test it in my organization.

Cheers.

sylr avatar Dec 20 '18 15:12 sylr

Hey, sorry about delay on this. Was fully occupied w/ KubeCon stuff for a bit. I expect I should be able to tackle the review comments within the next week or so.

sporkmonger avatar Dec 21 '18 22:12 sporkmonger

Hey, sorry about delay on this. Was fully occupied w/ KubeCon stuff for a bit. I expect I should be able to tackle the review comments within the next week or so.

No problem - Thanks for pushing forward with this; we're really excited to merge this in! 🎉I'll take another look at this in the next few days.

katzdm avatar Dec 24 '18 01:12 katzdm

Alrighty, good news, I finished up the stuff on my plate that was blocking me from getting back to this. Starting in on the requested changes now. Sorry it's been taking so long.

sporkmonger avatar Jan 08 '19 23:01 sporkmonger

I've addressed all the minor issues. There's a small number of outstanding issues remaining that I expect I should be able to address in the next few days.

sporkmonger avatar Jan 09 '19 02:01 sporkmonger

Wonderful!

sylr avatar Jan 09 '19 08:01 sylr

All changes have been made, this should be ready for re-review @katzdm.

sporkmonger avatar Jan 15 '19 00:01 sporkmonger

@sporkmonger although you implemented the OIDC provider, there's still no way to pass it as a valid provider right? When setting PROVIDER=oidc I got: "error":"unimplemented provider: \"oidc\"" and looking at https://github.com/buzzfeed/sso/blob/e67f1a6c633fb6a3aabed3debda346c83a8fe0ca/internal/auth/options.go seems like it really isn't an option ATM.

eyalzek avatar Jan 16 '19 14:01 eyalzek

@eyalzek Yeah, that's true, it's currently closer to a functional stub than anything. Part of the problem is that Azure AD doesn't quite follow the OIDC spec to the letter and that's what I've got to test with. I was kinda hoping someone with a more compliant OIDC provider available to them might be able to take a look and get that part over the finish line, so that's why I didn't fully enable it.

sporkmonger avatar Jan 16 '19 21:01 sporkmonger

I could certainly turn it on if you'd be willing to test it and work with me on making sure it works in practice, not just in the test suite.

sporkmonger avatar Jan 16 '19 21:01 sporkmonger

My expectation would be that groups are unlikely to work for most OIDC providers that don't have provider-specific implementations though (like this Azure AD PR) since groups have to be passed through in the token and many providers do that in very implementation-specific ways (e.g. Azure AD only does it if certain flags and scopes are set and even then only returns group UUIDs, not group names).

sporkmonger avatar Jan 16 '19 21:01 sporkmonger

After all the changes, looks like it's running into cookie size issues again.

sporkmonger avatar Jan 18 '19 02:01 sporkmonger

I'm going to wait for #137 to land before attempting to resolve.

sporkmonger avatar Jan 18 '19 02:01 sporkmonger

I'll be glad to help with testing. We're currently using dex in front of gitlab as the OIDC provider, but after looking at sso for a bit I'm wondering if dex is even necessary or can it be replaced with the sso-auth component.

eyalzek avatar Jan 18 '19 15:01 eyalzek

I'm working right now on trying to close gaps between sso-auth and dex. There's still some things dex does that sso-auth doesn't, but really depends on your use-case.

sporkmonger avatar Jan 18 '19 19:01 sporkmonger

Just an FYI, to simplify testing you can look at the "production" branch in our fork: https://github.com/Remitly/sso/tree/production

That contains a merged copy of all our outstanding PRs in one place. Since this PR doesn't really work terribly well without #150, it may be easier to test from the merged branch instead.

sporkmonger avatar Jan 29 '19 05:01 sporkmonger

@katzdm @shrayolacrayon Just rebased, let me know if there's anything else in need of updates. Let me know whenever you'd like me to squash.

sporkmonger avatar Jan 31 '19 19:01 sporkmonger

@sporkmonger Do you think it's ready to be tested ? I could give it a go tomorrow.

sylr avatar Jan 31 '19 20:01 sylr

Yes, however, you may need to incorporate the cookie size patches as part of the test.

sporkmonger avatar Jan 31 '19 20:01 sporkmonger

Hey Bob - Thanks for these changes. I just got back from some time off; going to look at this now.

katzdm avatar Feb 05 '19 17:02 katzdm

I've this error:

{"error":"unimplemented provider: \"azure\"","level":"error","msg":"","service":"sso-authenticator","time":"2019-02-07 10:45:08.2710"}
{"error":"unimplemented provider: \"azure\"","level":"error","msg":"error creating new Authenticator","service":"sso-authenticator","time":"2019-02-07 10:45:08.2710"}

sylr avatar Feb 07 '19 10:02 sylr

I managed to get further in the configuration but now I've this when I'm trying to log-in:

{
  "error": "could not verify id_token: oidc: expected audience \"<PROXY_CLIENT_ID>\\n\" got [\"<PROXY_CLIENT_ID>\"]",
  "level": "error",
  "msg": "error redeeming authentication code",
  "remote_address": "217.109.68.123",
  "service": "sso-authenticator",
  "time": "2019-02-07 12:13:05.2712"
}

<PROXY_CLIENT_ID> is really the uuid.

sylr avatar Feb 07 '19 12:02 sylr

Ok So I had to patch go-oidc to move forward : https://github.com/sylr/go-oidc/commit/fe511a3a1419a6ed154e62d08757eb113544613a

Now I've another error:

{
  "error": "could not get groups: api error: {\r\n  \"error\": {\r\n    \"code\": \"Authorization_RequestDenied\",\r\n    \"message\": \"Insufficient privileges to complete the operation.\",\r\n    \"innerError\": {\r\n      \"request-id\": \"658aedf4-04a1-4bcf-bd60-89f13fe10560\",\r\n      \"date\": \"2019-02-07T13:09:48\"\r\n    }\r\n  }\r\n}",
  "level": "error",
  "msg": "error redeeming authentication code",
  "remote_address": "217.109.68.123",
  "service": "sso-authenticator",
  "time": "2019-02-07 13:09:48.271"
}

Here all the permisssions I have given to my service principal

screenshot_38

sylr avatar Feb 07 '19 13:02 sylr

Ok I'm moving forward again. Now I've a redirect_uri problem. Here the flow I'm experiencing for login:

-> (sso-proxy) https://prometheus-sso.mycompany.com/
-> (sso-auth) https://sso-auth.mycompany.com/sign_in?client_id=<cliend_id>&redirect_uri=http://prometheus-sso.mycompany.com/oauth2/callback&...
-> https://sso-auth.mycompany.com/start?redirect_uri=https://sso-auth.mycompany.com/sign_in?client_id=<cliend_id>%0A&redirect_uri=http%3A%2F%2Fprometheus-sso.mycompany.com%2Foauth2%2Fcallback&response_type=code&scope=...
-> https://login.microsoftonline.com/<tenant_di>/oauth2/v2.0/authorize?client_id=<cliend_id>&...
-> https://login.microsoftonline.com/<tenant_di>/reprocess?ctx=...
-> https://login.microsoftonline.com/<tenant_di>/Consent/Set
-> (sso-auth) https://sso-auth.mycompany.com/oauth2/callback

~The last URL gives me a NGINX 502 error page cause I think I'm not supposed to be redirect to the sso-auth at this point but to the sso-proxy (i.e. https://prometheus-sso.mycompany.com/oauth2/callback)~

Here my configs:

sso-auth

  PROVIDER: "azure_v2"
  AZURE_TENANT: "<tenant_id>"
  APPROVAL_PROMPT: "consent"
  SSO_EMAIL_DOMAIN: "mycompany.com"
  PROXY_ROOT_DOMAIN: "mycompany.com"
  REDIRECT_URL: "https://sso-auth.mycompany.com"
  HOST: "sso-auth.mycompany.com"
  STATSD_HOST: "localhost"
  STATSD_PORT: "11111"
  VIRTUAL_HOST: "sso-auth.mycompany.com,sso-proxy-mycompany-com.monitoring.svc.cluster.local"

sso-proxy:

  EMAIL_DOMAIN: "mycompany.com"
  UPSTREAM_CONFIGS: "/conf/upstream_configs.yml"
  PROVIDER_URL: "https://sso-auth.mycompany.com"
  PROVIDER_URL_INTERNAL: "http://sso-proxy-mycompany-com.monitoring.svc.cluster.local"
  STATSD_HOST: "localhost"
  STATSD_PORT: "11111"
  COOKIE_SECURE: "false"
  VIRTUAL_HOST: "*.mycompany.com,*.monitoring.svc.cluster.local"
  CLUSTER: "infra"

sylr avatar Feb 07 '19 16:02 sylr

All right so after a lot of debugging the 502 error I had on https://sso-auth.mycompany.com/oauth2/callback was due to the massive URL size which was larger than the headers size authorized by Nginx which I have in front of sso-auth as reverse proxy.

I'll post the configuration that I had to tune to make it work.

sylr avatar Feb 07 '19 20:02 sylr

That's great to hear. Yeah, I'm using an AWS ALB in front of it, so I didn't hit the nginx issue. Azure uses absolutely massive values for everything so it's really easy to hit limit issues. We're probably contributing to that materially w/ the way we're using the state parameter.

I updated the docs at the top of the PR to use azure_v2 instead of azure.

sporkmonger avatar Feb 07 '19 20:02 sporkmonger

Could you clarify what the issue was with go-oidc that required trimming whitespace? I didn't hit any issues like that.

sporkmonger avatar Feb 07 '19 20:02 sporkmonger