yasson
POC: Improve deserialization performance
I'm currently rewriting deserialization to improve its performance.
Tests (ops/ms):
- NewDeserializerTest: deserialization of flat map with 50 elements (mix of JSON string, number, true, false and null)
- NewArrayDeserializerTest: deserialization of flat array with 100 elements
- NewPoJoDeserializerTest: Deserialization of simple POJO with strings, numbers and boolean values
- NewComplexPoJoStructureDeserializerTest: Deserialization of nested object structure
+------------------+-----+-----+-----+-----+
| Test | T1 | T2 | T3 | T4 |
+------------------+-----+-----+-----+-----+
| Old deserializer | 60 | 72 | 369 | 184 |
+------------------+-----+-----+-----+-----+
| 25th Nov 2019 | 114 | 125 | 466 | 293 |
+------------------+-----+-----+-----+-----+
| 28th Nov 2019 | 112 | 123 | 428 | 272 |
+------------------+-----+-----+-----+-----+
| 22nd Jan 2020 | 106 | 136 | 395 | 254 |
+------------------+-----+-----+-----+-----+
| 5th Mar 2020 | 108 | 110 | 391 | 237 |
+------------------+-----+-----+-----+-----+
@aguibert @Verdent @m0mus @mkarg What do you guys think about this? :)
PR to check: https://github.com/Tomas-Kraus/yasson/pull/1
@Tomas-Kraus I always appreciate a 20-80% performance gain. As I am not a contributor to Yasson, I leave the technical evaluation to those who are. :-)
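For reference, the relative gains implied by the table above can be worked out with a few lines of Java. This is only a sketch of the arithmetic: the class and method names (`SpeedupSketch`, `gainPercent`) are made up for illustration, and the numbers are copied from the table (ops/ms, columns T1 to T4).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class SpeedupSketch {

    // "Old deserializer" row from the table, in ops/ms for T1..T4.
    static final int[] OLD = {60, 72, 369, 184};

    /** Percentage gain of a new throughput figure over the old one. */
    static double gainPercent(int oldOps, int newOps) {
        return (newOps - oldOps) * 100.0 / oldOps;
    }

    public static void main(String[] args) {
        // Remaining rows of the table, keyed by measurement date.
        Map<String, int[]> runs = new LinkedHashMap<>();
        runs.put("25 Nov 2019", new int[] {114, 125, 466, 293});
        runs.put("28 Nov 2019", new int[] {112, 123, 428, 272});
        runs.put("22 Jan 2020", new int[] {106, 136, 395, 254});
        runs.put("5 Mar 2020", new int[] {108, 110, 391, 237});

        for (Map.Entry<String, int[]> run : runs.entrySet()) {
            StringBuilder line = new StringBuilder(run.getKey());
            for (int i = 0; i < OLD.length; i++) {
                double gain = gainPercent(OLD[i], run.getValue()[i]);
                line.append(String.format(Locale.ROOT, "  T%d: %+.1f%%", i + 1, gain));
            }
            System.out.println(line);
        }
    }
}
```

Since throughput is measured in ops/ms, a higher number is better, so a positive percentage means the new deserializer is faster than the old one for that test.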
great news @Tomas-Kraus, looking forward to evaluating it. It is kind of hard to view what the proposed changes are because the PR https://github.com/Tomas-Kraus/yasson/pull/1/files includes many commits that we have already merged into master and are unrelated to the POC.
Can you please rebase what you have on the current yasson/master branch and make a "draft PR" (newish github feature) on the Yasson repo? It will be easier to view that way and since it is a draft nobody can accidentally merge it until you mark it as a "regular" PR
Looks like git was a bit confused. I did a rebase and force-push, so the code changes are visible now. Also, this is far from finished; it's just another concept for the deserializer implementation and I want to know whether you like it or not. :)
thanks, now I can review the changes much more easily. I've added a comment on the PR on your personal repo, but I would suggest creating a "Draft PR" on the main Yasson repo so we can potentially have wider participation
If it improves performance and simplifies the codebase, I'm all for it. Main question will be backwards compatibility but I think we can rely on our unit tests and TCK tests here.
Yes, it must pass all jUnits and TCK. :)
According to https://github.com/fabienrenaud/java-json-benchmark, the actual performance of Yasson compared to Jackson is not really good. The fastest is https://github.com/ngs-doo/dsl-json, which also supports JSON-B. It would be nice to improve performance in Yasson.
I don't see anywhere on the java-json-benchmark where it actually mentions Yasson. The only related thing it compares is the reference implementation of JSON-P.
> I don't see anywhere on the java-json-benchmark where it actually mentions Yasson.
@aguibert yasson is evaluated in this benchmark: look at the chart on the X axis (for example, the first chart ("Deserialization performance") on the 3rd position from the left).
Also look at the benchmarked providers here: https://github.com/fabienrenaud/java-json-benchmark/blob/master/src/main/java/com/github/fabienrenaud/jjb/support/BenchSupport.java on line 18.
You can also run a subset of the performance tests yourself with only 3 json-providers:
./run deser --apis databind --libs jackson,dsljson,yasson
Why do Jackson and dsljson have so much better performance compared to Yasson?
Ah, I see Yasson is mentioned on the graph but it doesn't say what version they used. Never heard of dsljson (or anyone using it), but Jackson has been around for many years longer than JSON-B/Yasson and therefore has had more time to optimize performance. Our primary focus has been on adding functionality, rather than performance.
Yasson is still in the 1.0 version, and it doesn't surprise me that Jackson performs better. As you've found on this issue, @Tomas-Kraus is working on improving performance.
In any case, I don't think JSON databinding performance makes that much difference in the overall picture of application performance. I've run tests in the past that compare a JAX-RS application using JSON-B vs. Jackson, and once you add in network I/O, JSON databinding performance becomes negligible. If I were a user trying to evaluate which JSON library to choose, I would look at factors like adoption, compatibility, and stability before I looked at performance.
> Ah, I see Yasson is mentioned on the graph but it doesn't say what version they used.
According to https://github.com/fabienrenaud/java-json-benchmark/blob/master/build.gradle it uses v 1.0.1.
> If I were a user trying to evaluate which JSON library to choose, I would look at factors like adoption, compatibility, and stability before I looked at performance.
Definitely. Still, performance and memory consumption are important. If I were a maintainer, I would look especially at Jackson or dsljson to see what techniques they use to achieve such significantly better performance compared to Yasson.
I mentioned the java-json-benchmark only as an FYI :)
In case anyone is interested in up-to-date benchmark figures of Jackson vs Yasson, I've updated the aforementioned java-json-benchmark via a fork. This uses the latest released versions of both frameworks as of today, all evaluated on Amazon Corretto JDK 11.0.5. You can find the results in the README at https://github.com/chrisgleissner/java-json-benchmark. I also included the command-line invocations I used, for easy reproduction of the test.
In a nutshell, Jackson serialization is 2.12 times as fast as Yasson. Deserialization is 3.8 times as fast. And Jackson-Afterburner is even faster.
@chrisgleissner can you run it with the original Yasson? I would like to see numbers for both the original and this Yasson.
thanks for creating that comparison @chrisgleissner
@Tomas-Kraus you can run it on your own with the following steps:
git clone git@github.com:chrisgleissner/java-json-benchmark.git
cd java-json-benchmark
./run ser --apis databind --libs yasson,jackson --datatype users
To change the version of Yasson used, you can edit the build.gradle file in the root of the repository.
@aguibert does Yasson use java.lang.reflect under the hood? If so, then please consider using java.lang.invoke.LambdaMetafactory. It's way faster.
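To illustrate the suggestion, here is a minimal sketch of how java.lang.invoke.LambdaMetafactory can turn a getter into a plain `Function`, so that repeated calls do not go through `Method.invoke`. It says nothing about how Yasson actually accesses properties; the `User` class and the `getter` helper are hypothetical and exist only for this example.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

final class LambdaGetterSketch {

    // Hypothetical POJO, used only for illustration.
    static final class User {
        private final String name;
        User(String name) { this.name = name; }
        public String getName() { return name; }
    }

    /**
     * Builds a Function that calls the named no-arg getter directly.
     * This sketch only handles getters with reference return types.
     */
    @SuppressWarnings("unchecked")
    static <T, R> Function<T, R> getter(Class<T> type, String name, Class<R> returnType) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Resolve the getter once; this is the expensive, one-time step.
            MethodHandle target =
                    lookup.findVirtual(type, name, MethodType.methodType(returnType));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "apply",                                           // method of Function to implement
                    MethodType.methodType(Function.class),             // factory type: () -> Function
                    MethodType.methodType(Object.class, Object.class), // erased signature of apply
                    target,                                            // the getter to delegate to
                    MethodType.methodType(returnType, type));          // specialised signature: (T) -> R
            return (Function<T, R>) site.getTarget().invoke();
        } catch (Throwable t) {
            throw new IllegalStateException("Cannot create accessor for " + name, t);
        }
    }

    public static void main(String[] args) {
        Function<User, String> getName = getter(User.class, "getName", String.class);
        System.out.println(getName.apply(new User("Alice")));
    }
}
```

The accessor would typically be created once per property and cached, since building it is far more costly than a single reflective call; any benefit shows up only when the same accessor is reused many times.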
Well, looks like I found a dead end for JsonValue processing. :D
Benchmark                                 Mode  Cnt    Score    Error   Units
NewJsonValueDeserializationTest.testNew  thrpt    5    2.504 ±  0.023  ops/ms
NewJsonValueDeserializationTest.testOld  thrpt    5  227.769 ± 14.534  ops/ms
So using Json.createArrayBuilder() and building the value is roughly 90 times slower (2.504 vs. 227.769 ops/ms) than the old code using parser.getArray().
Looks like this part of the old deserializer was quite good and I won't improve it much. Using the original parser.getArray() / parser.getObject() with a slightly faster way to get to it will gain just a few percent.
The last benchmark run also showed me the cost of customization processing on all data types. I've lost a few percent everywhere.
Current progress: [ERROR] Tests run: 440, Failures: 27, Errors: 58, Skipped: 1
Please note that the maintainer of the JSON benchmark project that I cloned for my benchmark runs has now updated his results using Yasson 1.0.6. He also upgraded other frameworks. The benchmark uses JMH and you can find it at https://github.com/fabienrenaud/java-json-benchmark. You can see Yasson in the 'Users model' diagrams, and you can try a different version by changing the Yasson version in the top-level build.gradle file.