jjwt Optimization: Cache header

Initial tests show that caching the header could improve the performance by about 10%.

So this would be possible (as far as I can see), if keys returned by SigningKeyResolver could either be validated on use, or the cache cleared when keys are no longer valid.

Feb 24 '21 23:02 skjolber

Hey @skjolber,

I'm not sure I'm following completely, can you point to where you think the keys should be cached?

What implementation of SigningKeyResolver are you using? It should be possible to create an implementation that caches your key.

For example, I have a simplistic resolver that caches keys here: https://github.com/okta/okta-jwt-verifier-java/blob/master/impl/src/main/java/com/okta/jwt/impl/jjwt/RemoteJwkSigningKeyResolver.java

Feb 25 '21 15:02 bdemers

The whole JWT header (as a String) should be mapped to a cache entry containing

Header
SignatureAlgorithm
Key
CompressionCodec

Also, this means that the SigningKeyResolver only looks at the header, i.e. not the body, when determining the appropriate key.

Feb 25 '21 19:02 skjolber

So something like this.

Feb 25 '21 19:02 skjolber

OH, you want to cache the actual base64 encoded header in order to optimize the dserialization of the header?

So in the case of a JWT:

eyJ0eXAiOiJKV1QiLCJraWQiOiIxMjMiLCJhbGciOiJSUzUxMiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWUsImlhdCI6MTYxNDI5NTc4OSwiZXhwIjoxNjE0Mjk5Mzg5fQ.uG0HtgSKJQsXGMfEdWKMKOK3aY0oO1rURzgnMH6pPTsshdWaX3RYWyXLxjBHXkDETv78xWndFZbTd71ePjm4UoTF3hGYemFHVKcJt4cWCvfItQ86PNM0TwoX72QDrb36nMNpwxX3D6P08qLDfRCtJSnRo1k0U6gVnPs2CQv82SYy32iTqDc7xIumntfruBhfwXFRebanzjHBSRSat9B-1e6ReNdP7HybioDlWRMoQ0ypaAA4BkGwF0ChqOvI9KtSEiU6htJ2CJAgeRO7IeVgYg0K3ADVjI88Us4MUr7ZySFY7YE2LZgmxFhiHaDeWg3Z_lrRobpvRaF-11duNRODJw

You would use the header eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzUxMiJ9 as the cache key, and the value of said entry would be an object wrapping the decoded header value in my case a JwsHeader containing:

{
  "typ": "JWT",
  "kid": "123",
  "alg": "RS512"
}

(NOTE: this could be a JweHeader in the future)

This should be possible, we would need a cache interface, that likely defaults to an implementation that uses something like a SoftHashMap.

@skjolber any idea on the average gain from something like this? (or what percentage of the benchmark is spent dealing with the header)?

Feb 25 '21 23:02 bdemers

Run the benchmark using ./gradlew --stop && ./gradlew clean build jmhClasses jmh --info -x :frameworks:jjwt-bench:spotbugsMain --stacktrace with java 8 for the following output:

Benchmark	Mode	Cnt	Score	Error	Units
JwtVerifyBenchmark.auth0_verify	thrpt	5	28880,019	± 1439,777	ops/s
JwtVerifyBenchmark.fusionauth2_verify	thrpt	5	32855,697	± 1278,328	ops/s
JwtVerifyBenchmark.fusionauth_verify	thrpt	5	32672,284	± 450,467	ops/s
JwtVerifyBenchmark.jjwt_verify	thrpt	5	30357,304	± 875,569	ops/s
JwtVerifyBenchmark.jjwt_verify2	thrpt	5	32772,563	± 686,030	ops/s
JwtVerifyBenchmark.nimbus_verify	thrpt	5	19105,788	± 599,333	ops/s
JwtVerifyBenchmark.okta_verify	thrpt	5	1227,194	± 41,579	ops/s

In a nutshell, performane is 30357 vs 32772, so that is 8 percent, on AMD zen 3.

The header cache might also have a function in throwing away unwanted jwts, if the full set of expected header parameters is known (including the cache key), it should be trivial to create a cache which also is a filter, i.e. for example automatically generate all possible legal header keys on init or when loading remote jwks.

Feb 26 '21 17:02 skjolber

With key rotation for remote jwks, the list all of all valid kid isn't known ahead of time, so could not automatically prime the cache. I think for many use-cases the number of different headers an application sees is probably minimal, so caching after decoding should be enough to see the benefit.

Feb 26 '21 21:02 bdemers

It is unclear to me how valuable this would actually be in real-world applications, as opposed to a benchmark suite. (I'm not saying it is not valuable - just that I honestly don't know if it is or not).

Things like exp and jti fields are very common which means you still have to deserialize the header into JSON in order to assert things like expiration and other restrictions (nbf, etc). Most apps in practice should be using at least the exp header, so that implies deserialization in the large majority of cases anyway.

Does this make sense? Or might I be missing something (and I very well could be).

Feb 26 '21 22:02 lhazlewood

@bdemers Remote keys must be downloaded before evaulating tokens, so the set of possible headers will be known in time (at runtime), but lets save the 'permutation' approach for later - you're right the simpler caching after decoding should be good enough.

@lhazlewood Yes, but those fields are in the JWT body. The JWT header is usually data which almost static, like signature algorithm, body compression algorithm and key id, so can stay the same for days or months at a time.

Feb 27 '21 00:02 skjolber

Hey @skjolber, The list of jwks keys is NOT always known ahead of time, this is an important detail when it comes to key rotation and removing any exposed keys. The initial set of keys would be known, and these could be cached, but cache misses would be expected once the IdP starts issuing tokens with a newer key.

It's also worth mentioning, the caching strategy would also be slightly coupled to the SigningKeyResolver, (at least in the case of a JWKS endpoint implementation, invalid kid values would need to be removed from the cache to ensure future tokens do NOT use keys that have been removed or rotated out).

I'm also not as familiar with the list of JWE headers, so I don't know if this would be a JWS only optimization or not, @lhazlewood probably has more insight on this.

Mar 01 '21 16:03 bdemers

@bdemers I agree. The remote JWKs should be kept up-to-date; refreshes triggered either by time or by unknown keys. JWKs no longer on the list should be banned. Thus changes to the JWKs must affect the cache.

In general keeping the remote JWKs up-to-date in an optimal way is somewhat complicated - here is my take on it (implementation).

What I was trying to communicate was that once the JWKs are refreshed, before a JWT signature is evaluated, the key ids are known.

In my eyes, JWTs with unknown key ids is a trivial denial-of-service vector, so I've implemented a trivial rate limiter. Basically all our kubernetes JVMs could be asking the same authorization server for JWKs, which is not good - probably it will itself start dropping requests or take a long time to respond. So also we just do one request at a time per JVM; however all request with unknown key ids have to be kept waiting (in-flight), which is also negative. So the rate-limiter will result in some legal requests being dropped, if someone is making a lot of fake requests while we rotate the keys, which is acceptable for us.

Mar 01 '21 23:03 skjolber

@skjolber I know it has been a while, but I am still interested in this, it was just a lower priority than the JWE stuff we released recently.

That said, the newest parser implementation uses more efficient I/O (streams, buffers, etc) and @bdemers noticed a decent improvement in performance, sometimes more than 15-20% if I remember correctly.

Any interest in trying the benchmark again on the latest release?

Jan 29 '24 23:01 lhazlewood

@lhazlewood Sure, however are you committed to actually taking in potentially painful changes for performance improvements? My (very simple) previous PR has been pending for years, indicating that your are not.

Jan 30 '24 12:01 skjolber

@skjolber that's a fair critique and concern, and I appreciate what you're saying.

However, that PR still remains open because it's not clear what the JDK 8 changes to the JJWT APIs for 1.0 will look like and I was going to evaluate it in the context of those changes. In other words, my https://github.com/jwtk/jjwt/pull/651#issuecomment-789249066 still applies.

That said, I was more curious about what the 0.12.0 performance enhancements look like in the benchmark suite, as well as how the JWT header caching might still benefit, so I'll try to run it locally first before bothering you any further. I definitely don't want to distract you until we're actively ready to prioritize this relative to the other 1.0 work.

Thanks for checking in!

Jan 30 '24 21:01 lhazlewood

@lhazlewood so have you considered adding a benchmarking (JMH) module to the project itself? My impression is that this library supports more features than the other libs I have seen (i.e. the ones used in java-jwt-benchmark) and so a wider (handful?) selection of performance tests could be useful.

Jan 30 '24 23:01 skjolber

jjwt jjwt copied to clipboard

Optimization: Cache header

jjwt
jjwt copied to clipboard