microsoft-authentication-library-for-python

example for web api token validation

Open ctaggart opened this issue 4 years ago • 34 comments

I need to validate the identity using a Flask web API. It is a different flow than the ms-identity-python-webapp example. I think I just need to validate the jwt token and some of its values similar to what is done here with this code. Is this something that can be done with msal or am I better off using code like that?

Client credentials flow

ctaggart avatar Jan 29 '20 14:01 ctaggart

Hi Cameron, thank you for your valuable question! You clearly did your research. :-)

So, what you want is the token validation. And that is NOT the same as "client credentials flow". For what it's worth, client credentials flow is still about the token acquisition, not token validation.

Regardless, we do not currently have a sample for token validation. I'm marking this as an enhancement item here, and we will take it into consideration the next time we update our plan.

rayluo avatar Jan 29 '20 18:01 rayluo

Thank you, Ray. Yes, token validation is what I'm looking for.

ctaggart avatar Jan 29 '20 20:01 ctaggart

+1 for token example

gwsampso avatar Feb 05 '20 04:02 gwsampso

Hi,

This is not a bug, but I need help with how to implement this feature. I have a Flask web app, A, created by following this tutorial (Link), and another website, B, hosted with a different provider, which is trying to access A's API. I added CORS in A but want to implement OAuth2 in Azure. How do I do that? I am looking for a scenario where no credentials prompt is shown: like a daemon app, it should be able to get the token. I am new to all this. I hope it is clear what I am looking for.

dsjijo avatar Mar 24 '20 05:03 dsjijo

Any example would also do. Thanks.

dsjijo avatar Mar 24 '20 05:03 dsjijo

Hi all, I have found a repository doing the same (Link).

I was able to get the token by following this link, and I restructured my app like the secureFlaskApp.

Thanks for the help.

dsjijo avatar Mar 25 '20 17:03 dsjijo

A sufficient number of JWT validation checks is being performed in the msal.oauth2cli.oidc.decode_id_token(), which is called upon adding tokens into TokenCache: token_cache.py:137. But these checks do not include signature verification, [update: which is not necessary when obtaining tokens directly from the AAD server over SSL, e.g. via .acquire_token_by_authorization_code()].

Also, msal depends on pyjwt library, which contains API method for full JWT validation. The method requires AAD public key, so here is the way to call it [update: for ID tokens]:

  1. Load OpenID configuration from https://login.microsoftonline.com/common/v2.0/.well-known/openid-configuration
  2. Use the jwks_uri endpoint to load AAD public keys (currently https://login.microsoftonline.com/common/discovery/v2.0/keys).
  3. Take the key that corresponds to "kid" field value of JWT header.
  4. Base64-decode the value of key's "x5c" field and decode it as X.509 certificate in DER format. Convert its public key part into PEM format.
  5. Call jwt.decode(itoken, public_key, audience=<client_id>), supplying client_id of your application, and catch exceptions that it can raise.

from base64 import b64decode
import jwt
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
import requests

# `token` (the ID token string) and `client_id` (the application ID) are
# assumed to be defined before this point.
jwks_uri = 'https://login.microsoftonline.com/common/discovery/v2.0/keys'
jwkeys = requests.get(jwks_uri).json()['keys']

# Pick the published key whose "kid" matches the one in the token header.
token_key_id = jwt.get_unverified_header(token)['kid']
jwk = [key for key in jwkeys if key['kid'] == token_key_id][0]
# "x5c" holds a base64-encoded DER certificate; extract its public key as PEM.
der_cert = b64decode(jwk['x5c'][0])
cert = x509.load_der_x509_certificate(der_cert, default_backend())
public_key = cert.public_key()
pem_key = public_key.public_bytes(encoding=serialization.Encoding.PEM, format=serialization.PublicFormat.SubjectPublicKeyInfo)
# PyJWT 2.x requires the accepted algorithms to be listed explicitly.
token_claims = jwt.decode(token, pem_key, algorithms=['RS256'], audience=client_id)

update: This method may fail for access tokens, because they might be issued for another audience (e.g. Microsoft Graph API) and signed with audience-specific key.
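Step 3 above is where a lookup most often goes wrong: if no published key has the token's "kid", the list indexing simply raises an IndexError. As a rough sketch only, using just the standard library, the header can be read and the key selected with a clearer failure. The helper names `unverified_header` and `select_jwk` are hypothetical (they are not part of msal or pyjwt), and nothing here verifies the signature.

```python
import base64
import json


def unverified_header(token):
    """Decode the header segment of a compact JWT without verifying anything."""
    header_b64 = token.split(".")[0]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def select_jwk(token, jwks):
    """Return the entry of a JWKS 'keys' list whose kid matches the token."""
    kid = unverified_header(token).get("kid")
    for key in jwks:
        if key.get("kid") == kid:
            return key
    raise LookupError("no published key with kid %r" % kid)
```

A LookupError here usually suggests that the keys were fetched from a different authority than the one that issued the token.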

eprigorodov avatar Nov 05 '20 13:11 eprigorodov

Hi @eprigorodov, thank you for the code review on our existing implementation. We love these community voices. :-)

With regard to the topic you brought up above, it is actually a different topic than the current issue. The current issue is about Access Token validation, the topic you brought up is about ID Token validation. MSAL already performs ID token validation, we just validate it in a different-than-pyjwt way, but still specs-compliant.

Should you have follow-up question on ID token validation, please create ANOTHER issue for its subsequent discussion.

rayluo avatar Nov 05 '20 20:11 rayluo

Would be great to have functionality to verify access_token sent from the FE. It would seem like this functionality would make sense given the Angular and upcoming React libraries in @azure/msal-browser. Those access_token need to be validated by your API whether it is Express, Django/DRF, etc. somehow and it is not clear how to do this in this library's current state.

cheslijones avatar Nov 07 '20 16:11 cheslijones

@rayluo, thank you for pointing out, indeed I was looking for an issue about ID token validation and misread the actual subject.

@cheslijones, access tokens issued for Microsoft APIs are usually supposed to be opaque to the client and will be validated by the endpoint service itself. Clients just send them as-is to the target API and process responses with possible validation errors. msal.TokenCache will take care of expiration and refresh, if used in the recommended way (see point 2 of the usage guide).
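The cache-first pattern referred to above can be sketched as follows. This is only an illustration under stated assumptions: `app` is assumed to be an `msal.ConfidentialClientApplication` used in a daemon-style (client credentials) scenario, and the wrapper name `get_access_token` is hypothetical. Newer MSAL releases may consult the cache inside `acquire_token_for_client()` on their own, in which case the explicit silent call is redundant but harmless.

```python
def get_access_token(app, scopes, account=None):
    """Try the token cache first, then fall back to a fresh token request."""
    # Returns None when no usable token is cached for these scopes.
    result = app.acquire_token_silent(scopes, account=account)
    if not result:
        # Assumption: client credentials flow; other flows use other acquire_* calls.
        result = app.acquire_token_for_client(scopes=scopes)
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "token acquisition failed"))
    return result["access_token"]
```

The returned string is then sent unchanged in the Authorization header of the request to the target API.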

eprigorodov avatar Nov 09 '20 19:11 eprigorodov

@eprigorodov It sounds like the library does not support my use case.

cheslijones avatar Nov 09 '20 19:11 cheslijones

@cheslijones The MSAL is a client authentication library. Server middleware libraries are listed here: https://docs.microsoft.com/en-us/azure/active-directory/develop/reference-v2-libraries#microsoft-supported-server-middleware-libraries

For Django you can try authlib. It uses the same approach as the code in the comment above. JWT signature verification will work if access token is signed with one of "generic" Microsoft keys, the ones exposed via well-known OpenID configuration.

For that to work, the authorization request, the token request, and the resulting access token should include one of the scopes defined for the web application itself (listed under "Scopes defined by this API" on the "Expose an API" blade of the Azure App Registration).
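Once the signature has been verified, a web API would typically also check that the token carries one of its own scopes. A minimal sketch, assuming the claims have already been decoded into a dict and that delegated scopes arrive as a space-separated string in the "scp" claim, as they do in Azure AD access tokens. The function name `has_scope` and the scope name used in the usage note are placeholders.

```python
def has_scope(claims, required_scope):
    """Check whether a decoded access token grants the given delegated scope."""
    # "scp" is a single space-separated string, not a list.
    granted = claims.get("scp", "").split()
    return required_scope in granted
```

For example, `has_scope(token_claims, "access_as_user")` would be called after signature and audience validation, never instead of it.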

eprigorodov avatar Nov 11 '20 09:11 eprigorodov

I am looking for the same solution for token validation. In my case the front end (web) is what signs the user in and keeps the user signed in for every request made to the back-end API. When the request reaches the back end, though, the back end needs to validate the token to know whether it comes from that specific AD, using the APPID and TENANTID. Otherwise the connection to AD would have to be made by the back end, which makes no sense when the idea is to sign in with Microsoft credentials and keep the user signed in for the lifetime of the access token.

GuiLuccaDev avatar Nov 11 '20 19:11 GuiLuccaDev

Of course, there are reference solutions out there as mentioned above. And yes, this is a client authentication library, but the recommended most secure flow is the authorization code flow, which requires this to be run on the server in order to have control of how you issue tokens to the clients (client secrets).

I think this makes it a very suitable place to include a def validate_token(self, audience,...) -> DecodedToken: somewhere in the class ClientApplication(object):, which could then be used from any middleware. The implementation would be right there for use, and potential security or performance bugs in an area as critical as token validation (performed on every request) would be avoided across the many servers using the authorization code flow (or any other setup that requires token acquisition and validation to happen in the same application).

I think including this feature in the library would be great for us users and will mitigate potential vulnerabilities of improper validation by everyone re-implementing reference solutions and making mistakes.

marchom avatar Dec 14 '20 13:12 marchom

Hello guys,

I have the same problem, @rayluo any evolution at that point?

guicarvalho avatar Feb 11 '21 14:02 guicarvalho

Sorry, not yet. :-(

rayluo avatar Feb 11 '21 23:02 rayluo

hello guys,

I too have same problem, @rayluo any evolution?

Priyanka9496 avatar Mar 15 '21 12:03 Priyanka9496

hey, I found a good example for the token validation that also uses the on behalf of flow to let the API communicate with the Microsoft API's. https://github.com/Azure-Samples/ms-identity-python-on-behalf-of

would be awesome if msal could have the functionality for validation, it was a pain to figure out :3

tjeerddie avatar May 05 '21 11:05 tjeerddie

Hi @eprigorodov, I am using your code to decode the client-side token given by Teams to a tab. I have also raised my query here: https://stackoverflow.com/questions/67401139/using-python-decode-client-side-token-fetched-by-microsoft-teams-and-given-to-ta

I am getting this error: jwt.exceptions.InvalidAudienceError: Invalid audience

Is there anything that you can help me with? I used the Application ID (from registering the app in Azure Active Directory) as client_id.

datasleek avatar May 05 '21 12:05 datasleek

Hi,

I decoded the token again in jwt.ms, found the aud parameter, and used that value as the audience to decode the token claims again. It was successful.

datasleek avatar May 05 '21 13:05 datasleek

Hi @datasleek, as per documentation, the audience "aud" claim in ID token is either Application (client) ID or Application ID URI, e.g.:

'3513283e-1abe-420c-8de0-7415d2d26ae0'
'api://3513283e-1abe-420c-8de0-7415d2d26ae0'

Application ID URI can also be set in the Azure Portal interface, on the "Expose an API" blade of Application Registration object.

The rules for audience claims in access tokens are more complex; they depend on the requested scopes. For example, tokens issued for Microsoft Graph scopes may contain a magical audience OID "00000003-0000-0000-c000-000000000000".

It is possible also to turn off audience verification in jwt: jwt.decode(id_token, key, options={'verify_aud': False}).
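Rather than switching audience verification off, both audience forms mentioned above can be accepted explicitly. The following is a sketch under stated assumptions: the claims are already decoded into a dict, the Application ID URI defaults to "api://" plus the client ID unless a custom one was configured, and the helper names `accepted_audiences` and `audience_ok` are hypothetical.

```python
def accepted_audiences(client_id, app_id_uri=None):
    """List the audience values this application is willing to accept."""
    # Assumption: the default Application ID URI is "api://<client_id>".
    return [client_id, app_id_uri or "api://" + client_id]


def audience_ok(claims, client_id, app_id_uri=None):
    """Check the 'aud' claim, which may be a single string or a list."""
    aud = claims.get("aud")
    token_audiences = [aud] if isinstance(aud, str) else list(aud or [])
    allowed = accepted_audiences(client_id, app_id_uri)
    return any(candidate in allowed for candidate in token_audiences)
```

As far as I know, the list from `accepted_audiences()` can also be passed as the `audience` argument of `jwt.decode()`, which accepts an iterable of allowed values.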

eprigorodov avatar May 05 '21 16:05 eprigorodov

Hi @eprigorodov ,

I apologise for the late reply. Your code helped me to successfully decode the clientSideToken fetched by Microsoft Teams from AAD. Regarding aud, yes, you are correct. I also got help on this on Stack Overflow. Thanks for your reply and for the code.

datasleek avatar May 09 '21 19:05 datasleek

Hi @eprigorodov, I used the options dict with an access_token but it is not working. I assume the key in the above method is the public key. Also, if we use jwt.io to check the access_token, its signature is reported as invalid there too, while for the id_token it reports the signature as verified. Maybe access tokens are not meant to be handled this way? Please suggest a solution for access tokens.

doverk96 avatar Jun 10 '21 11:06 doverk96

For anyone else searching for this, I have found this example: https://github.com/ivangeorgiev/gems/tree/main/src/python-azure-ad-token-validate which validates access token.

RamunasAdamonis avatar Sep 09 '21 07:09 RamunasAdamonis

@rayluo, is there currently a plan to add token validation to this library? I understand it's not necessary to validate tokens for the graph API. However, I'm currently using this library to obtain tokens for my own API, by setting the scopes to point to my app registration id. The access token that my client app obtains is subsequently sent along with requests to the API, so I need to validate it in that API. I'm working with both Azure AD and B2C because my organization supports different login methods, and I found out those tokens need to be validated differently (the JWKS are different). I implemented my own code to validate the tokens, but it was quite difficult to figure out how to do that, and the MS documentation wasn't helpful in that regard. It would be great if this library could be used server-side to handle signature validation.

Is there any way I can contribute to such a feature?

I think the previously mentioned examples are helpful for AD validation. For B2C, I found this article helpful: https://robertoprevato.github.io/Validating-JWT-Bearer-tokens-from-Azure-AD-in-Python/. For anyone using Fast API, I recommend this library: https://github.com/Intility/fastapi-azure-auth

robteeuwen avatar Jan 14 '22 14:01 robteeuwen

I implemented my own code to validate the tokens, but ... Is there any way I can contribute to such a feature?

Feel free to share a link to your implementation, so that we (Microsoft and the entire community) can utilize it.

We have not decided whether a token validation feature should be part of this MSAL library, but it does no harm to consolidate some working code as a sample. In fact, some other Microsoft libraries/components initially started as a sample and evolved from there.

rayluo avatar Jan 14 '22 20:01 rayluo

CC @jennyf19 @jmprieur for token validation requests in PY

bgavrilMS avatar Aug 17 '23 18:08 bgavrilMS

This repo from MS may be useful for someone reading this in search of a FastAPI plugin or a generic reference solution to validate access tokens. It includes a decorator function (one liner above the method serving an api endpoint) for FastAPI, but the source code is similar to other references like mentioned above.

A few thoughts on that reference:

  1. Consider using the fork containing a not yet merged PR adding a missing return statement that helps debugging.
  2. The other PR regarding multi-tenancy support may or may not have security implications.
  3. Something that is mentioned elsewhere and that may help debugging is that an Azure app registration may specify the token version; otherwise it will default to 1. You'll find this in Azure/App Reg//Manifest under "accessTokenAcceptedVersion", which can be set to null/1/2. This is relevant when debugging audience/issuer validation errors, as the expected format varies with the token version, as seen in auth_service.
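To make the point about token versions concrete, here is a small sketch of how the expected issuer differs between the two versions in the public Azure cloud. The function name `expected_issuer` is hypothetical, the tenant ID in the usage note is a placeholder, and national clouds or B2C tenants use other hosts.

```python
def expected_issuer(tenant_id, token_version):
    """Return the 'iss' value expected for a given accessTokenAcceptedVersion."""
    # null and 1 both yield v1.0 tokens, issued by sts.windows.net.
    if token_version in (None, 1, "1.0"):
        return "https://sts.windows.net/%s/" % tenant_id
    # 2 yields v2.0 tokens, issued by login.microsoftonline.com.
    if token_version in (2, "2.0"):
        return "https://login.microsoftonline.com/%s/v2.0" % tenant_id
    raise ValueError("unsupported token version: %r" % (token_version,))
```

Comparing `expected_issuer("<tenant-id>", 2)` with the "iss" claim of a token decoded in jwt.ms is a quick way to tell which version the app registration is actually issuing. The "ver" claim of the token itself carries "1.0" or "2.0".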

@rayluo How would you recommend approaching this now for access token validation in a custom api? Using something from msal-python, a reference implementation, or authlib/pyjwt etc ?

pschiager avatar Oct 26 '23 08:10 pschiager

This drives me crazy, any help appreciated. X5eXk4xyojNFum1kl2Ytv8dlNP4-c57dO6QGTVBwaNk is the kid field from my token header:

unverified_header = jwt.get_unverified_header(token)
print(unverified_header["kid"])

These are the kid fields returned by jsonurl = urlopen("https://login.microsoftonline.com/common/discovery/v2.0/keys"):

T1St-dLTvyWRgxB_676u8krXS-I
5B3nRxtQ7ji8eNDc3Fy05Kf97ZE
fwNs8F_h9KrHMA_OcTP1pGFWwyc
whLKSpgJ1osmGxXH6PbiDyBjoxE
jCScSBGaA6xAieLw3-sAuvjRM_0

They are totally different in length, and none of them match.

I also tried jsonurl = urlopen("https://login.microsoftonline.com/440bfb1c-65d8-489c-a830-f1a118d83066/discovery/keys?p=B2C_1_user_signin_signup"). Since I am using a custom policy, I can't fetch any keys because it returns 400X.

cutesweetpudding avatar Jan 11 '24 10:01 cutesweetpudding

@cutesweetpudding - the jwks_uri is found by going to the well-known metadata endpoint of the identity provider. This is typically the authority + ".well-known/openid-configuration". For example

https://login.microsoftonline.com/common/.well-known/openid-configuration
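The discovery step described above can be sketched with the standard library alone. The helper names `metadata_url` and `discover_jwks_uri` are hypothetical, and the authority passed in must be the one that actually issued the token (for a B2C custom policy that is the policy-specific authority, not "common").

```python
import json
from urllib.request import urlopen


def metadata_url(authority):
    """Build the OpenID Connect metadata URL for an authority."""
    return authority.rstrip("/") + "/.well-known/openid-configuration"


def discover_jwks_uri(authority):
    """Fetch the metadata document and return its advertised jwks_uri."""
    with urlopen(metadata_url(authority)) as response:
        return json.load(response)["jwks_uri"]
```

Calling `discover_jwks_uri("https://login.microsoftonline.com/common")` performs a network request and returns the keys URL to use in place of a hard-coded one.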

bgavrilMS avatar Jan 11 '24 11:01 bgavrilMS