jwks-rsa-java
Bad design of jwks cache
Describe the problem you'd like to have solved
- Users can affect each other when they call a server that uses JwkProvider under the hood to validate tokens. See the code below:
import com.auth0.jwk.JwkProviderBuilder
import com.auth0.jwk.SigningKeyNotFoundException
import com.auth0.jwk.UrlJwkProvider
import java.net.URI
import java.util.UUID
import java.util.concurrent.TimeUnit
import org.junit.jupiter.api.assertThrows

fun main() {
    val jwksUrl = "https://login.microsoftonline.com/common/discovery/keys".let {
        URI(it).normalize().toURL()
    }
    // A kid that exists in the JWKS, and a random one that does not.
    val foundKey = UrlJwkProvider(jwksUrl).all.first().id
    val notFoundKey = UUID.randomUUID().toString()
    val jwkProvider =
        JwkProviderBuilder(jwksUrl)
            .cached(100, 10, TimeUnit.MINUTES)
            .rateLimited(10, 1, TimeUnit.MINUTES)
            .build()
    /*
    Someone calls the server with a JWT that has an invalid kid.
    */
    repeat(11) {
        assertThrows<SigningKeyNotFoundException> {
            jwkProvider.get(notFoundKey)
        }
    }
    /*
    Someone else calls the server with a valid JWT and gets a rate limit error.
    */
    jwkProvider.get(foundKey)
}
- If the OAuth2 server exposes several JWKs (with different kid values and algorithms), the library calls the JWKS endpoint as many times as there are keys. https://github.com/auth0/jwks-rsa-java/blob/master/src/main/java/com/auth0/jwk/UrlJwkProvider.java#L163
Describe the ideal solution
- Rate limiting should not be there at all; it is the responsibility of an upper layer.
- It is preferable to cache all JWKs at once, not one by one.
I think the strategy should be the following: cache the entire JWKS response from the server and return the JWK that matches the requested kid. The cache should be refreshed periodically.
Alternatives and current work-arounds
import com.auth0.jwk.Jwk
import com.auth0.jwk.JwkProvider
import com.auth0.jwk.SigningKeyNotFoundException
import com.auth0.jwk.UrlJwkProvider
import java.io.Closeable
import java.net.URI
import java.time.Duration
import java.util.UUID
import kotlin.concurrent.timer
import org.junit.jupiter.api.assertThrows

class CachedJwkProvider(
    private val delegate: UrlJwkProvider,
    private val expiration: Duration
) : JwkProvider, Closeable {
    // Written by the timer thread and read by callers, hence @Volatile.
    @Volatile
    private var cache = mapOf<String, Jwk>()
    private val cacheUpdaterJob = timer(
        name = "jwks-cache-updater",
        daemon = true,
        period = expiration.toMillis()
    ) {
        // Keep the previous snapshot if a refresh fails; an uncaught exception would stop the timer.
        runCatching { delegate.all.associateBy { it.id } }
            .onSuccess { cache = it }
    }

    override fun get(keyId: String): Jwk {
        return cache[keyId] ?: throw SigningKeyNotFoundException("No key found with kid $keyId", null)
    }

    override fun close() {
        cacheUpdaterJob.cancel()
    }
}

fun main() {
    val jwksUrl = "https://login.microsoftonline.com/common/discovery/keys".let {
        URI(it).normalize().toURL()
    }
    val urlJwkProvider = UrlJwkProvider(jwksUrl)
    val foundKey = urlJwkProvider.all.first().id
    val notFoundKey = UUID.randomUUID().toString()
    val jwkProvider = CachedJwkProvider(urlJwkProvider, Duration.ofMinutes(10))
    /*
    Wait for the cache to load. It is up to the implementation whether get() blocks or not, but I
    prefer not to wait on any third-party system, so we only wait here for the sake of the test.
    */
    while (runCatching { jwkProvider.get(foundKey) }.isFailure) {
        Thread.sleep(100)
    }
    /*
    Someone calls the server with a JWT that has an invalid kid.
    */
    repeat(1000) {
        assertThrows<SigningKeyNotFoundException> {
            jwkProvider.get(notFoundKey)
        }
    }
    /*
    Someone else calls the server with a valid JWT and does NOT get a rate limit error.
    */
    jwkProvider.get(foundKey)
}
Thanks @scrat98 for the feedback and proposed alternatives! Regarding the rate limiting, you can choose not to enable it by not configuring it, correct? Regarding the caching behavior, that's something we should look into. It sounds familiar, so I'll check whether we investigated this in the past and whether there were any findings about the cache behaving that way.
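For reference, a configuration without rate limiting might look like the sketch below. It assumes the `rateLimited(Boolean)` overload of `JwkProviderBuilder`; as far as I can tell the builder turns rate limiting on by default, so simply leaving out the `rateLimited(...)` call may not be enough to disable it. The URL is the one from the example above.

```kotlin
import com.auth0.jwk.JwkProviderBuilder
import java.net.URI
import java.util.concurrent.TimeUnit

fun main() {
    val jwksUrl = URI("https://login.microsoftonline.com/common/discovery/keys").toURL()
    val jwkProvider = JwkProviderBuilder(jwksUrl)
        .cached(100, 10, TimeUnit.MINUTES) // same cache settings as in the issue
        .rateLimited(false)                // assumption: explicitly switches the rate limiter off
        .build()
    // jwkProvider.get(kid) can now be called without hitting the built-in bucket.
}
```

With this setup, throttling requests for unknown kid values becomes the caller's responsibility.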
Hi @scrat98,
Thanks for raising this issue and for the detailed explanation and example.
We've made a change to UrlJwkProvider in the latest release to optimize how JWKS are fetched and cached:
We now cache the entire JWKS response after the first successful fetch.
On subsequent key lookups, if the requested kid is found in the cache, it's returned directly. If it's not found, the provider refreshes the JWKS once and attempts the lookup again. Only after a second miss is a SigningKeyNotFoundException thrown.
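The lookup flow described above could be sketched roughly as follows. This is not the library's actual code: `Key`, `KeyNotFoundException`, `RefreshOnMissProvider`, and the `fetchAll` callback are hypothetical stand-ins so the example runs without jwks-rsa-java, and the real UrlJwkProvider may differ in details such as locking and error handling.

```kotlin
// Hypothetical stand-ins for Jwk and SigningKeyNotFoundException.
data class Key(val id: String)
class KeyNotFoundException(kid: String) : Exception("No key found with kid $kid")

class RefreshOnMissProvider(private val fetchAll: () -> List<Key>) {
    // Snapshot of the whole JWKS, keyed by kid; null until the first successful fetch.
    @Volatile
    private var cache: Map<String, Key>? = null

    // Fetch the full key set once and replace the snapshot.
    @Synchronized
    private fun refresh(): Map<String, Key> {
        val fresh = fetchAll().associateBy { it.id }
        cache = fresh
        return fresh
    }

    fun get(kid: String): Key {
        // First attempt: answer from the cached snapshot if there is one.
        cache?.get(kid)?.let { return it }
        // Miss (or nothing cached yet): refresh once, retry, then give up.
        return refresh()[kid] ?: throw KeyNotFoundException(kid)
    }
}

fun main() {
    var fetches = 0
    val provider = RefreshOnMissProvider {
        fetches++
        listOf(Key("a"), Key("b"))
    }
    provider.get("a")                        // populates the cache
    provider.get("b")                        // served from the cache
    println(fetches)                         // prints 1
    runCatching { provider.get("missing") }  // miss, one refresh, then exception
    println(fetches)                         // prints 2
}
```

Note that in this sketch every lookup of an unknown kid still triggers one fetch, which is why some form of rate limiting in front of the refresh remains relevant.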
Please refer to the documentation on configuring rate limits and configuring network timeout settings.
Thank you
@tanya732 great, thank you. I'll close this issue then. Unfortunately, I won't have time to run the code example above to validate the changes, but it looks good.