oidc backend - preferred_username not used to provision users, they end up with a random hex username
Expected behaviour
When using the oidc backend to authenticate against Azure AD, users are provisioned with a username based on the preferred_username claim.
Actual behaviour
Users are provisioned with a random hexadecimal username.
What are the steps to reproduce this issue?
I'm reproducing this via NetBox which uses python-social-auth. I don't believe this issue is related to how NetBox configures python-social-auth, but I can dig into any configuration details if it's helpful.
- Create an app registration in Azure AD
- Add a client secret to the app registration
- Configure NetBox's configuration.py with:
  REMOTE_AUTH_BACKEND = 'social_core.backends.open_id_connect.OpenIdConnectAuth'
  SOCIAL_AUTH_OIDC_OIDC_ENDPOINT = 'https://login.microsoftonline.com/{tenant id}/v2.0'
  SOCIAL_AUTH_OIDC_KEY = '{app registration id}'
- Run netbox/manage.py runserver
- Initiate login by visiting http://localhost:8000/oauth/login/oidc/
- Authenticate with Azure AD & get redirected back to NetBox
- Look at the top right of the screen - the username is in hex.
Any logs, error output, etc?
I patched the social_core.backends.open_id_connect.OpenIdConnectAuth.get_user_details method to produce some debugging output:
def get_user_details(self, response):
    username_key = self.setting('USERNAME_KEY', self.USERNAME_KEY)
    print('username_key', username_key)
    from pprint import pprint
    print('response:')
    pprint(dict(response.items()))
    return {
        'username': response.get(username_key),
        'email': response.get('email'),
        'fullname': response.get('name'),
        'first_name': response.get('given_name'),
        'last_name': response.get('family_name'),
        'groups': response.get('groups'),  # not standardized but widely implemented
    }
This results in the following output:
username_key preferred_username
response:
{'access_token': '{an access token}',
'email': '{an email address}',
'expires_in': 4008,
'ext_expires_in': 4008,
'family_name': 'Name',
'given_name': 'My',
'id_token': '{an id token}',
'name': 'My Name',
'picture': 'https://graph.microsoft.com/v1.0/me/photo/$value',
'scope': 'profile openid email User.Read',
'sub': '{a sub claim}',
'token_type': 'Bearer'}
If I take the id token from the above and decode it by hand (split into three parts separated by . and base64-decode the 2nd part) then I can see the id token returned by Azure AD. And it does include the preferred_username claim. But the response passed in by social_core.pipeline.social_auth.social_details is missing the preferred_username key.
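A minimal sketch of the by-hand decode described above, using only the standard library. The function name decode_jwt_payload is made up for illustration, and nothing here verifies the signature, so it is only suitable for inspecting a token while debugging.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature (debugging only)."""
    # A JWT is three base64url-encoded parts: header.payload.signature
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    # JWTs strip the base64 padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

With the response printed above, decode_jwt_payload(response['id_token']) would be expected to return a dict that includes preferred_username, even though response itself does not.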
For the record, the id token contains the following claims:
aud, iss, iat, nbf, exp, email, name, nonce, oid, preferred_username, rh, sub, tid, uti, ver
And the access token contains the following claims:
aud, email, exp, iat, iss, name, nbf, nonce, oid, preferred_username, rh, sub, tid, uti, ver
The ver for both tokens is 2.0.
Any other comments?
I haven't dug through the code yet to see exactly how the response is populated & whether it's simply a case that this claim was never extracted from the id token. I'll continue to dig into it and update this issue with what I find.
But I thought I'd file the issue now, in case someone else who's more familiar with the code can take a quick look and say whether my guess about the preferred_username not being copied out of the id token is correct.
Oh, and I can't just hack NetBox to configure python-social-auth to use the email as the username, because my users don't all have email addresses.
I've figured out that response is the result of fetching an additional v1.0 access token (I don't understand why), with the resulting dict updated with the result of calling the userinfo endpoint (that's where the extra fields like given_name come from).
I have hacked this into my NetBox configuration.py file:
REMOTE_AUTH_BACKEND = 'netbox.configuration.OpenIdConnectAuth'

from social_core.backends import open_id_connect
from social_core.exceptions import AuthTokenError  # needed by the except clause below


class OpenIdConnectAuth(open_id_connect.OpenIdConnectAuth):
    '''
    https://github.com/python-social-auth/social-core/issues/709
    '''

    def get_user_details(self, response):
        '''
        The stock OpenIdConnectAuth configures response to be the result of the
        call to the 'get access token' endpoint, with the result of the call to
        the 'get user info' endpoint sprinkled in.
        It doesn't include the actual decoded access token or id token
        provided by Azure AD!
        '''
        import jwt as realjwt
        try:
            # The signature is deliberately not verified here.
            decoded_id_token = realjwt.decode(response['id_token'], options={
                'verify_signature': False
            })
        except (realjwt.DecodeError, realjwt.ExpiredSignatureError) as de:
            raise AuthTokenError(self, de)
        return {
            'username': decoded_id_token['preferred_username'],
            'email': response.get('email'),
            'fullname': response.get('name'),
            'first_name': response.get('given_name'),
            'last_name': response.get('family_name'),
            'groups': response.get('groups'),
        }
Which is ugly but it works. I'm trusting that the id token handed directly to me from Azure AD is not malicious, hence not validating it (in fact I lifted the code from the AzureADOAuth2.user_data method). This token decode is probably better done in another method (user_data?). But then there's already a validate_and_return_id_token method, which does return the claims from the id token, but it's only called by the request_access_token method, and the decoded claims are not put into response, but instead are saved into a field of the OpenIdConnectAuth instance!?! This is all very confusing...
This does the same thing but more cleanly, by relying on the id_token attribute set on the backend instance in its request_access_token method.
REMOTE_AUTH_BACKEND = 'netbox.configuration.OpenIdConnectAuth'

from social_core.backends import open_id_connect


class OpenIdConnectAuth(open_id_connect.OpenIdConnectAuth):
    '''
    https://github.com/python-social-auth/social-core/issues/709
    '''

    def get_user_details(self, response):
        '''
        The stock OpenIdConnectAuth configures response to be the result of the
        call to the 'get access token' endpoint, which gives us a v1.0 access
        token for some reason. It then mixes in the result of the call to the
        'get user info' endpoint.
        As a result, 'preferred_username' will never make it into response.
        It turns out that OpenIdConnectAuth does decode & validate the original
        id token, and stores it as an attribute on itself; it makes no further
        use of the id token, but we can use that attribute to obtain values
        from the original id token and use them to provision a user with a
        username based on the preferred_username claim.
        '''
        return {
            'username': self.id_token['preferred_username'],
            'email': response.get('email'),
            'fullname': response.get('name'),
            'first_name': response.get('given_name'),
            'last_name': response.get('family_name'),
            'groups': response.get('groups'),
        }
I have an idea about why OpenIdConnectAuth receives a v1.0 token instead of a v2.0 token. It turns out there is a hidden property on app registrations, accessTokenAcceptedVersion. You can't display or change the value of this property in the Azure Portal, and it defaults to being unset, which means the app gets a v1.0 token. Good grief.
Once you know what to search for you can find this documented here.
I'm going to try changing the value of this attribute to 2 and then see if my subclass for OpenIdConnectAuth is no longer necessary. If so I'll close this issue. Though it still seems a bit weird how the access token claims are put into the response and the identity token claims aren't used. But I probably need an OpenID Connect expert to take a look and give me the answer... ;)
Well, even after setting accessTokenAcceptedVersion to 2, the access_token in the response is a v1.0 access token!
Not that it actually matters anyway -- on closer inspection the claims from response["access_token"] aren't decoded & copied to response after all. That was tired me talking. response is the result of calling the userinfo endpoint, + some other stuff.
So there's no way that preferred_username will ever get into response. So any code that expects it to be there will never find it.
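For context on where the hex string comes from: as far as I can tell, the pipeline step social_core.pipeline.user.get_username falls back to uuid4().hex when the details dict carries no username. The sketch below is a simplified stand-in for that fallback, not the library's actual code; pick_username is a hypothetical name, and the real step also handles things like slugifying and uniqueness.

```python
from uuid import uuid4


def pick_username(details: dict) -> str:
    """Simplified imitation of the username fallback (hypothetical helper)."""
    username = details.get("username")
    if username:
        return username
    # No usable username in the details: fall back to a random 32-char hex string.
    return uuid4().hex


# get_user_details() looks up 'preferred_username' in response, which is absent,
# so the details dict ends up with username=None.
response = {"email": "user@example.com", "name": "My Name"}
details = {"username": response.get("preferred_username")}
generated = pick_username(details)
```

Here generated is a random hex string, which matches the usernames seen in NetBox.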
However it's easy enough to override the class as above in order to access the decoded claims via self.id_token. I think my modified get_user_details method is usable as is, I wonder if someone who knows more about OAuth/OpenID Connect can comment.
An alternative could be: copy claims from id_token into response before the values from the 'get user info' endpoint are copied in. That way the get_user_details method doesn't need to be changed, as it will have access to both the claims from the id token & the result of the 'get user info' endpoint.
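A rough sketch of that merge order, written as a plain function so it doesn't depend on social-core internals. build_response and its parameter names are hypothetical; where exactly this would hook into OpenIdConnectAuth (user_data is one candidate) is an open question.

```python
def build_response(token_response: dict, id_token_claims: dict, userinfo: dict) -> dict:
    """Merge the three sources so get_user_details() can see id token claims."""
    merged = dict(token_response)   # result of the 'get access token' call
    merged.update(id_token_claims)  # decoded id token claims go in first...
    merged.update(userinfo)         # ...so userinfo values still win on conflict
    return merged


merged = build_response(
    {"access_token": "{an access token}", "token_type": "Bearer"},
    {"preferred_username": "my.name", "name": "From Id Token"},
    {"name": "My Name", "given_name": "My"},
)
```

With this ordering, merged.get('preferred_username') finds the claim from the id token, while name keeps the value from the userinfo endpoint, so the stock get_user_details would work unchanged.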
Note to self: there's an Azure AD specific class available, social_core.backends.azuread_tenant.AzureADV2TenantOAuth2, which might be better to override. It uses the preferred_username claim for the user ID. But it also uses the name claim for the username(?) and it incorrectly also uses preferred_username for email, instead of email.
Hi @yrro - were you able to solve your issue? I've run into something similar: neither the roles nor the groups claim is recognized at all when I use SOCIAL_AUTH_OIDC_OIDC_ENDPOINT = 'https://login.microsoftonline.com/{redacted_mytenantid}/v2.0', which is the right endpoint according to the App Registration docs.
But when I set it to https://login.microsoftonline.com/{redacted_mytenantid} I see roles and groups (I've set the App roles according to https://learn.microsoft.com/en-us/entra/identity-platform/howto-add-app-roles-in-apps ).
The main difference is of course 'ver': '1.0' vs 'ver': '2.0' so this corresponds to https://nicolgit.github.io/AzureAD-Endopoint-V1-vs-V2-comparison/
I haven't had a chance to take another look at this. According to the ID token claims reference, the roles claim is always present in both v1.0 and v2.0 tokens. According to the Optional claims reference, the groups claim is only present in v1.0 or v2.0 tokens if the app registration is configured to include it.
Hello
I had the same issue but I found in the source code that we can specify the username key in configuration.py
SOCIAL_AUTH_OIDC_USERNAME_KEY = "email"
This sets the username to the email address. I also tried other values for this setting and they work as well.
def get_user_details(self, response):
    username_key = self.setting("USERNAME_KEY", self.USERNAME_KEY)
    return {
        "username": response.get(username_key),
        "email": response.get("email"),
        "fullname": response.get("name"),
        "first_name": response.get("given_name"),
        "last_name": response.get("family_name"),
    }
https://github.com/python-social-auth/social-core/blob/d7bba223c0036581b63b01d05e53b115c606dbec/social_core/backends/open_id_connect.py#L264
Hope it helps