Fix synchronization for JWT increasing performance for token validation
Motivation:
The observation was that reusing objects like Signature, Cipher and Mac and running it within synchronization blocks has a big impact on application performance. Used in a handler it serializes all eventloop threads and potentially introduces a lot of wait for all other threads.
Using the Signature.getInstance(String algorithm) does not add a lot of overhead and is then safe to use within the same context for the operation.
Doing some performance tests on EC256 signatures got me nearly the same performance multiplicator as CPU cores available.
I'm wondering if we could not store those in a thread local when the thread is a vertx thread (excluding virtual threads) otherwise create it as we go like you are doing
Even calling getInstance on the same Thread will get you different instances of the object. So the idea is, not to re-use them. So i am not sure if we would win a lot with that except getting it off the eventloop thread.
This might be done best on the authentication handler level i guess?
If your thought was in a different direction, let me know
I think that what @vietj suggests is to create a something like ThreadLocalRandom for the Ciper or Signature instance, that would use a FastThreadLocal storage for reuse of cipher and signature when running on an event loop thread, otherwise creating objects on demand.
If you have time to do this it would be great.
Understood but the Random use-case is a different one as this "should" be shared by definition, sharing the Signature object does not really add a lot of benefits from what i read so far.
I can do some further tests if it adds any performance gain.
Did conduct some tests and the results are:
Using FastThreadLocal on a 12 core system had around 52ms versus 227ms with a plain getInstance for Signature
and 36ms over 2651ms for Cipher (20 million cycles)
I could not find a vert.x way of checking for VertxThread and/or isVirtual as the impl package is no longer accessible in vert.x 5.x with modules so i used FastThreadLocalThread.currentThreadHasFastThreadLocal() from netty for now
have a look at my last commit if this is in the direction or your idea and let me know
@zyclonite sorry for the late reply. I did an update for JWE in https://github.com/zyclonite/vertx-auth/pull/1 that you can replicate in JWK and JWS.
took me a while to find some time...
did a quick apply of your approach on the JWS impl as well but this approach would only work if the algorithm would always be the same for the same thread... that's the reason why the test fails now
@zyclonite I wanted to create a pull request to your repo with https://github.com/eclipse-vertx/vertx-auth/pull/726/commits/c7d5d59e9d4adc281e658800e7c6c6dabe1de274 but incidentally pushed directly to your branch.
Anyway, that should fix the problem you mentioned in your previous comment.
Do we have some evidence (e.g. microbenchmarks) that creating signatures and ciphers on the fly is more expensive than reusing them?
@tsegismont looks great, thanks for that solution.
I fear your example has the same issue -> JWE
I can fix it there as well but i found out that the JWE implementation does actually not work properly, is nearly not implemented at all and does not have tests. I tried to think ahead and start implementing at least the recommended things but this starts to break with the JWK approach as it gets very complicated and would require a lot of refactoring.
I fear your example has the same issue ->
JWE
I'm not sure what you mean
initially i was referring to your commit https://github.com/eclipse-vertx/vertx-auth/pull/726/commits/412682563165e06a8ee34ae302b083f9ee2f4cdc , that needs the same change with a selector on the algorithm, there is just no test that could fail in the same way.
the other statement was to the JWE implementation itself. not only that no test exist, it could have never worked based on that line https://github.com/eclipse-vertx/vertx-auth/blob/76717f72da6977e14394ef4ca67dfca47d430f31/vertx-auth-common/src/main/java/io/vertx/ext/auth/impl/jose/JWE.java#L38 (only if use=sig and that makes no sense)
so this class needs a proper implementation first, then tests but in a different PR.
i looked a bit into it but this will require a lot of refactoring of the JWK implementation as well...
@zyclonite thanks for clarifying. I think our immediate goal must be to fix the performance issue where the implementation is trusted and tested.
the other statement was to the JWE implementation itself. not only that no test exist, it could have never worked based on that line
@pmlopes any comment on this?
@zyclonite is there an opportunity for you to compare the performance of single-use ciphers/signatures vs fast thread local caching when running on a FastThreadLocalThread? In this case, could you please report the results?
sorry, i lost a bit track of that issue and did not find time to do some extended testing but i am pretty sure it adds performance
shall i discard that PR as @vietj has started refactoring the whole crypto part of the implementation?
sorry, i lost a bit track of that issue and did not find time to do some extended testing but i am pretty sure it adds performance
shall i discard that PR as @vietj has started refactoring the whole crypto part of the implementation?
yes, you can, this code will be moved to vertx-core, you can help review the other PR
later we will introduce fast thread local in the code moved to vertx-core that will be easier to implement