pyjwt
The JWKS endpoint did not return a JSON object when keys are found in the cache
When the jwks_client.get_jwk_set function finds the keys in the cache, it gets a PyJWKSet object; otherwise it performs a fetch and gets a JSON object. In both cases it checks whether the data is a dictionary, so the check fails when the data was retrieved from the cache.
https://github.com/jpadilla/pyjwt/blob/95638cf04f83b3e2c289b28810501e53195ff938/jwt/jwks_client.py#L68
The right code would return the object found in the cache.
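A minimal sketch of that idea, under stated assumptions: PyJWKSet and PyJWKClientError below are simplified stand-ins for the pyjwt classes of the same names, and resolve_jwk_set is a hypothetical free function that mirrors the body of get_jwk_set. It is one possible shape for the fix, not necessarily the change the library should adopt.

```python
from typing import Any, Callable, Dict, List, Union


class PyJWKClientError(Exception):
    """Stand-in for the pyjwt exception of the same name."""


class PyJWKSet:
    """Simplified stand-in for jwt.PyJWKSet: it only keeps the raw key dicts."""

    def __init__(self, keys: List[Dict[str, Any]]) -> None:
        self.keys = keys

    @staticmethod
    def from_dict(obj: Dict[str, Any]) -> "PyJWKSet":
        return PyJWKSet(obj.get("keys", []))


def resolve_jwk_set(
    cached: Union[PyJWKSet, Dict[str, Any], None],
    fetch: Callable[[], Any],
) -> PyJWKSet:
    """Hypothetical helper: return a PyJWKSet from the cache or from a fetch."""
    # A cache that honors its type hints already holds a ready PyJWKSet.
    if isinstance(cached, PyJWKSet):
        return cached
    # Otherwise fall back to the existing path: a dict from the cache or the URL.
    data = cached if cached is not None else fetch()
    if not isinstance(data, dict):
        raise PyJWKClientError("The JWKS endpoint did not return a JSON object")
    return PyJWKSet.from_dict(data)
```

With this shape the dictionary check only applies to data that has not been parsed yet, so a cached PyJWKSet is returned untouched.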
Would you mind contributing the fix, please?
I've further investigated why the current tests do not raise this error. I put the conclusions in the PR, but I'm adding them here so they are easy to find for anyone who runs into the issue.
The root cause seems to be that fetch_data feeds the cache not with a PyJWKSet but with the dict read directly from the URL. Because the cache then holds a dict rather than the expected type, get_jwk_set obtains a dict from the cache, and the test passes.
In my case I was feeding the cache in the client directly from an externally stored JWKS, to avoid the fetch from the URL. To do so, I manually created the PyJWKSetCache object and fed it with a PyJWKSet. That's why in my case the cache contained the expected data type and my code failed with the error.
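The difference between the two paths can be reproduced with a small sketch. Everything below is an assumption made for illustration: the classes are simplified stand-ins that only mimic the pyjwt names, the cache is reduced to a tuple with a timestamp, and get_jwk_set is written as a free function with the same control flow as the method in the library.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class PyJWKClientError(Exception):
    """Stand-in for the pyjwt exception of the same name."""


class PyJWKSet:
    """Simplified stand-in for jwt.PyJWKSet: it only keeps the raw key dicts."""

    def __init__(self, keys: List[Dict[str, Any]]) -> None:
        self.keys = keys

    @staticmethod
    def from_dict(obj: Dict[str, Any]) -> "PyJWKSet":
        return PyJWKSet(obj.get("keys", []))


class JWKSetCache:
    """Reduced stand-in cache: stores whatever it is given, plus a timestamp."""

    def __init__(self, lifespan: int) -> None:
        self.lifespan = lifespan
        self._entry: Optional[Tuple[Any, float]] = None

    def put(self, jwk_set: Any) -> None:
        self._entry = (jwk_set, time.monotonic()) if jwk_set is not None else None

    def get(self) -> Any:
        if self._entry is None:
            return None
        jwk_set, timestamp = self._entry
        if time.monotonic() > timestamp + self.lifespan:
            return None
        return jwk_set


def get_jwk_set(cache: JWKSetCache, fetch_data: Callable[[], Any]) -> PyJWKSet:
    """Same control flow as the current get_jwk_set, as a free function."""
    data = cache.get()
    if data is None:
        data = fetch_data()
    if not isinstance(data, dict):
        raise PyJWKClientError("The JWKS endpoint did not return a JSON object")
    return PyJWKSet.from_dict(data)


raw = {"keys": [{"kid": "example"}]}

# Path taken by fetch_data: a dict lands in the cache, so the check passes.
dict_cache = JWKSetCache(lifespan=300)
dict_cache.put(raw)
from_dict_cache = get_jwk_set(dict_cache, lambda: raw)

# Path from this report: the cache holds a properly typed PyJWKSet.
typed_cache = JWKSetCache(lifespan=300)
typed_cache.put(PyJWKSet.from_dict(raw))
error_message = None
try:
    get_jwk_set(typed_cache, lambda: raw)
except PyJWKClientError as exc:
    error_message = str(exc)
```

In this sketch only the second path raises, and it raises the error from the issue title even though no endpoint was contacted.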
Here's the result of my code inspection:
The first time get_jwk_set is called, it gets None from the cache and calls fetch_data. This is the part that, in my opinion, does not honor the data types declared in the type hints:
def fetch_data(self) -> Any:
    jwk_set: Any = None
    try:
        r = urllib.request.Request(url=self.uri, headers=self.headers)
        with urllib.request.urlopen(r, timeout=self.timeout) as response:
            jwk_set = json.load(response)
    except (URLError, TimeoutError) as e:
        raise PyJWKClientConnectionError(
            f'Fail to fetch data from the url, err: "{e}"'
        )
    else:
        return jwk_set
    finally:
        if self.jwk_set_cache is not None:
            self.jwk_set_cache.put(jwk_set)  # --> This puts a JSON dict in the cache
Then the JWKSetCache put method, which should receive a PyJWKSet as per the type hint, receives a dictionary:
class JWKSetCache:
    def __init__(self, lifespan: int) -> None:
        self.jwk_set_with_timestamp: Optional[PyJWTSetWithTimestamp] = None
        self.lifespan = lifespan

    def put(self, jwk_set: PyJWKSet) -> None:  # --> BUT this expects a PyJWKSet, not a dict (PyJWKSet is not a dict subclass)
        if jwk_set is not None:
            self.jwk_set_with_timestamp = PyJWTSetWithTimestamp(jwk_set)  # --> The dict goes into constructing the PyJWTSetWithTimestamp
        else:
            # clear cache
            self.jwk_set_with_timestamp = None
And that dictionary will find its way through to the final object stored in the cache:
class PyJWTSetWithTimestamp:
    def __init__(self, jwk_set: PyJWKSet):
        self.jwk_set = jwk_set  # --> The dict ends up as the value in the cached element
        self.timestamp = time.monotonic()

    def get_jwk_set(self) -> PyJWKSet:
        return self.jwk_set  # --> This will NOT return a PyJWKSet but a dict

    def get_timestamp(self) -> float:
        return self.timestamp
Then, the JWKSetCache class will blindly return the dict instead of the PyJWKSet.
class JWKSetCache:
    def __init__(self, lifespan: int) -> None:
        self.jwk_set_with_timestamp: Optional[PyJWTSetWithTimestamp] = None
        self.lifespan = lifespan

    def put(self, jwk_set: PyJWKSet) -> None:
        if jwk_set is not None:
            self.jwk_set_with_timestamp = PyJWTSetWithTimestamp(jwk_set)
        else:
            # clear cache
            self.jwk_set_with_timestamp = None

    def get(self) -> Optional[PyJWKSet]:
        if self.jwk_set_with_timestamp is None or self.is_expired():
            return None
        return self.jwk_set_with_timestamp.get_jwk_set()  # --> So the cache returns a dict, not a PyJWKSet
And finally, get_jwk_set always gets a dict, whether it takes the data from the URL or from the cache:
def get_jwk_set(self, refresh: bool = False) -> PyJWKSet:
    data = None
    if self.jwk_set_cache is not None and not refresh:
        data = self.jwk_set_cache.get()  # --> So here we get a dict; although this function's code is wrong, it works
    if data is None:
        data = self.fetch_data()
    if not isinstance(data, dict):
        raise PyJWKClientError("The JWKS endpoint did not return a JSON object")
    return PyJWKSet.from_dict(data)
Any code that expects to receive a PyJWKSet from the cache, or that, as in my case, feeds the cache some other way than through a fetch_data call, will fail.
The code in the PR fixes the problem described in this issue without affecting any other current use of the module, since both the accepted and returned types of the function are preserved and no other parts of the module change. However, it does not fix use cases that call fetch_data and access the cache directly instead of going through the fixed get_jwk_set.
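For those remaining use cases, one conceivable follow-up (not part of the PR, and purely an assumption on my side) would be to normalize the data when it enters the cache, so that direct readers always get the declared type. In the sketch below, NormalizingJWKSetCache is a hypothetical class and PyJWKSet is a simplified stand-in for the pyjwt class of the same name.

```python
import time
from typing import Any, Dict, List, Optional, Union


class PyJWKSet:
    """Simplified stand-in for jwt.PyJWKSet: it only keeps the raw key dicts."""

    def __init__(self, keys: List[Dict[str, Any]]) -> None:
        self.keys = keys

    @staticmethod
    def from_dict(obj: Dict[str, Any]) -> "PyJWKSet":
        return PyJWKSet(obj.get("keys", []))


class NormalizingJWKSetCache:
    """Hypothetical cache that always stores a PyJWKSet, whatever it is given."""

    def __init__(self, lifespan: int) -> None:
        self.lifespan = lifespan
        self._jwk_set: Optional[PyJWKSet] = None
        self._timestamp = 0.0

    def put(self, jwk_set: Union[PyJWKSet, Dict[str, Any], None]) -> None:
        # Convert a raw JWKS dict on the way in; passing None clears the cache.
        if isinstance(jwk_set, dict):
            jwk_set = PyJWKSet.from_dict(jwk_set)
        self._jwk_set = jwk_set
        self._timestamp = time.monotonic()

    def get(self) -> Optional[PyJWKSet]:
        if self._jwk_set is None:
            return None
        if time.monotonic() > self._timestamp + self.lifespan:
            return None
        return self._jwk_set
```

A cache like this only works together with a get_jwk_set that accepts a PyJWKSet from the cache, so it would complement the fix rather than replace it.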
Besides the tests passing because fetch_data and get_jwk_set happen to agree on putting a dictionary in the cache, I've found that, because of how the test at https://github.com/jpadilla/pyjwt/blob/95638cf04f83b3e2c289b28810501e53195ff938/tests/test_jwks_client.py#L222 is coded and how it uses mock, the second call to get_signing_key never really executes the method but directly returns the same PyJWK object. The assertion that the mocked urlopen method is not called succeeds, but without the code having actually run a second time.
This most probably also happens, though I did not test them all, in every other test function that makes two calls to the same method of the same object with the same parameter values.
Adding a new test like this one reproduces the issue as expected:
RESPONSE_DATA_WITH_MATCHING_KID = {
"keys": [
{
"alg": "RS256",
"kty": "RSA",
"use": "sig",
"n": "0wtlJRY9-ru61LmOgieeI7_rD1oIna9QpBMAOWw8wTuoIhFQFwcIi7MFB7IEfelCPj08vkfLsuFtR8cG07EE4uvJ78bAqRjMsCvprWp4e2p7hqPnWcpRpDEyHjzirEJle1LPpjLLVaSWgkbrVaOD0lkWkP1T1TkrOset_Obh8BwtO-Ww-UfrEwxTyz1646AGkbT2nL8PX0trXrmira8GnrCkFUgTUS61GoTdb9bCJ19PLX9Gnxw7J0BtR0GubopXq8KlI0ThVql6ZtVGN2dvmrCPAVAZleM5TVB61m0VSXvGWaF6_GeOhbFoyWcyUmFvzWhBm8Q38vWgsSI7oHTkEw",
"e": "AQAB",
"kid": "NEE1QURBOTM4MzI5RkFDNTYxOTU1MDg2ODgwQ0UzMTk1QjYyRkRFQw",
"x5t": "NEE1QURBOTM4MzI5RkFDNTYxOTU1MDg2ODgwQ0UzMTk1QjYyRkRFQw",
"x5c": [
"MIIDBzCCAe+gAwIBAgIJNtD9Ozi6j2jJMA0GCSqGSIb3DQEBCwUAMCExHzAdBgNVBAMTFmRldi04N2V2eDlydS5hdXRoMC5jb20wHhcNMTkwNjIwMTU0NDU4WhcNMzMwMjI2MTU0NDU4WjAhMR8wHQYDVQQDExZkZXYtODdldng5cnUuYXV0aDAuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0wtlJRY9+ru61LmOgieeI7/rD1oIna9QpBMAOWw8wTuoIhFQFwcIi7MFB7IEfelCPj08vkfLsuFtR8cG07EE4uvJ78bAqRjMsCvprWp4e2p7hqPnWcpRpDEyHjzirEJle1LPpjLLVaSWgkbrVaOD0lkWkP1T1TkrOset/Obh8BwtO+Ww+UfrEwxTyz1646AGkbT2nL8PX0trXrmira8GnrCkFUgTUS61GoTdb9bCJ19PLX9Gnxw7J0BtR0GubopXq8KlI0ThVql6ZtVGN2dvmrCPAVAZleM5TVB61m0VSXvGWaF6/GeOhbFoyWcyUmFvzWhBm8Q38vWgsSI7oHTkEwIDAQABo0IwQDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQlGXpmYaXFB7Q3eG69Uhjd4cFp/jAOBgNVHQ8BAf8EBAMCAoQwDQYJKoZIhvcNAQELBQADggEBAIzQOF/h4T5WWAdjhcIwdNS7hS2Deq+UxxkRv+uavj6O9mHLuRG1q5onvSFShjECXaYT6OGibn7Ufw/JSm3+86ZouMYjBEqGh4OvWRkwARy1YTWUVDGpT2HAwtIq3lfYvhe8P4VfZByp1N4lfn6X2NcJflG+Q+mfXNmRFyyft3Oq51PCZyyAkU7bTun9FmMOyBtmJvQjZ8RXgBLvu9nUcZB8yTVoeUEg4cLczQlli/OkiFXhWgrhVr8uF0/9klslMFXtm78iYSgR8/oC+k1pSNd1+ESSt7n6+JiAQ2Co+ZNKta7LTDGAjGjNDymyoCrZpeuYQwwnHYEHu/0khjAxhXo="
],
},
{
"alg": "RS256",
"kty": "RSA",
"use": "sig",
"n": "0wtlJRY9-ru61LmOgieeI7_rD1oIna9QpBMAOWw8wTuoIhFQFwcIi7MFB7IEfelCPj08vkfLsuFtR8cG07EE4uvJ78bAqRjMsCvprWp4e2p7hqPnWcpRpDEyHjzirEJle1LPpjLLVaSWgkbrVaOD0lkWkP1T1TkrOset_Obh8BwtO-Ww-UfrEwxTyz1646AGkbT2nL8PX0trXrmira8GnrCkFUgTUS61GoTdb9bCJ19PLX9Gnxw7J0BtR0GubopXq8KlI0ThVql6ZtVGN2dvmrCPAVAZleM5TVB61m0VSXvGWaF6_GeOhbFoyWcyUmFvzWhBm8Q38vWgsSI7oHTkEw",
"e": "AQAB",
"kid": "NEE1QURBOTM4MzI5RkFDNTYxOTU1MDg2ODgwQ0UzMTk1QjYyRkRFQwKKK",
"x5t": "NEE1QURBOTM4MzI5RkFDNTYxOTU1MDg2ODgwQ0UzMTk1QjYyRkRFQw",
"x5c": [
"MIIDBzCCAe+gAwIBAgIJNtD9Ozi6j2jJMA0GCSqGSIb3DQEBCwUAMCExHzAdBgNVBAMTFmRldi04N2V2eDlydS5hdXRoMC5jb20wHhcNMTkwNjIwMTU0NDU4WhcNMzMwMjI2MTU0NDU4WjAhMR8wHQYDVQQDExZkZXYtODdldng5cnUuYXV0aDAuY29tMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0wtlJRY9+ru61LmOgieeI7/rD1oIna9QpBMAOWw8wTuoIhFQFwcIi7MFB7IEfelCPj08vkfLsuFtR8cG07EE4uvJ78bAqRjMsCvprWp4e2p7hqPnWcpRpDEyHjzirEJle1LPpjLLVaSWgkbrVaOD0lkWkP1T1TkrOset/Obh8BwtO+Ww+UfrEwxTyz1646AGkbT2nL8PX0trXrmira8GnrCkFUgTUS61GoTdb9bCJ19PLX9Gnxw7J0BtR0GubopXq8KlI0ThVql6ZtVGN2dvmrCPAVAZleM5TVB61m0VSXvGWaF6/GeOhbFoyWcyUmFvzWhBm8Q38vWgsSI7oHTkEwIDAQABo0IwQDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBQlGXpmYaXFB7Q3eG69Uhjd4cFp/jAOBgNVHQ8BAf8EBAMCAoQwDQYJKoZIhvcNAQELBQADggEBAIzQOF/h4T5WWAdjhcIwdNS7hS2Deq+UxxkRv+uavj6O9mHLuRG1q5onvSFShjECXaYT6OGibn7Ufw/JSm3+86ZouMYjBEqGh4OvWRkwARy1YTWUVDGpT2HAwtIq3lfYvhe8P4VfZByp1N4lfn6X2NcJflG+Q+mfXNmRFyyft3Oq51PCZyyAkU7bTun9FmMOyBtmJvQjZ8RXgBLvu9nUcZB8yTVoeUEg4cLczQlli/OkiFXhWgrhVr8uF0/9klslMFXtm78iYSgR8/oC+k1pSNd1+ESSt7n6+JiAQ2Co+ZNKta7LTDGAjGjNDymyoCrZpeuYQwwnHYEHu/0khjAxhXo="
],
}
]
}
[...]
def test_get_signing_key_caches_result(self):
    url = "https://dev-87evx9ru.auth0.com/.well-known/jwks.json"
    kid = "NEE1QURBOTM4MzI5RkFDNTYxOTU1MDg2ODgwQ0UzMTk1QjYyRkRFQw"
    jwks_client = PyJWKClient(url, cache_keys=True)
    with mocked_success_response(RESPONSE_DATA_WITH_MATCHING_KID) as first_call:
        mresult = jwks_client.get_signing_key(kid)
    assert mresult.key_id == kid
    assert first_call.call_count == 1
    # mocked_response does not allow urllib.request.urlopen to be called twice
    # so a second mock is needed
    with mocked_success_response(RESPONSE_DATA_WITH_MATCHING_KID) as repeated_call:
        result2 = jwks_client.get_signing_key(kid + "KKK")
    assert result2.key_id == kid + "KKK"
    assert repeated_call.call_count == 0
This test fails with the error mentioned in this issue when run against the existing code, and passes with the code in the PR. I will add this test to the PR.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 7 days