[FLINK-28751][Table SQL/Runtime] Optimize the performance of the json functions
What is the purpose of the change
This PR improves the performance of the built-in JSON functions. The default LRUCache used by the JsonPath CacheProvider relies heavily on locking, which hurts performance under concurrent access.
Brief change log
- Create a new JsonPathCache and set it on the CacheProvider when the class is loaded
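The pattern in the change log above can be sketched with the JDK alone. This is a minimal illustration, not the code in this PR: `PathCache`, `PathCacheRegistry`, `ConcurrentPathCache`, and `JsonFunctions` are hypothetical stand-ins for the library's cache interface, its global provider, the new cache, and the class that uses it. The sketch assumes a `ConcurrentHashMap`-backed cache with a crude size bound, so that reads do not go through a single global lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the cache interface of the JSON path library. */
interface PathCache {
    String get(String key);

    void put(String key, String compiledPath);
}

/** Hypothetical stand-in for the library's global cache provider. */
final class PathCacheRegistry {
    private static volatile PathCache cache;

    private PathCacheRegistry() {}

    static void setCache(PathCache newCache) {
        cache = newCache;
    }

    static PathCache getCache() {
        return cache;
    }
}

/** Cache backed by ConcurrentHashMap: lookups do not take a global lock. */
final class ConcurrentPathCache implements PathCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final int maxEntries;

    ConcurrentPathCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    @Override
    public String get(String key) {
        return entries.get(key);
    }

    @Override
    public void put(String key, String compiledPath) {
        // Crude bound for this sketch: drop everything once the cache is full.
        // A production cache would evict selectively (for example by recency).
        if (entries.size() >= maxEntries && !entries.containsKey(key)) {
            entries.clear();
        }
        entries.put(key, compiledPath);
    }
}

/** The static initializer installs the cache once, when the class is loaded. */
final class JsonFunctions {
    static {
        PathCacheRegistry.setCache(new ConcurrentPathCache(400));
    }

    private JsonFunctions() {}

    static String compile(String path) {
        PathCache cache = PathCacheRegistry.getCache();
        String compiled = cache.get(path);
        if (compiled == null) {
            // Placeholder for the real (expensive) path compilation.
            compiled = "compiled:" + path;
            cache.put(path, compiled);
        }
        return compiled;
    }
}
```

Installing the cache in a static initializer means it is set before the first lookup from that class, without any explicit setup call by the user.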
Verifying this change
The functionality is covered by the existing tests. I have not written a performance test for it; if needed, I will add one. I manually tested the change with a production job, which showed a 2~4 times performance improvement.
Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
- The S3 file system connector: (no)
Documentation
- Does this pull request introduce a new feature? (no)
CI report:
- 625c09a891fb6e04c97058690d3f40cfecdc9d7e Azure: SUCCESS
Bot commands
The @flinkbot bot supports the following commands:
- @flinkbot run azure: re-run the last Azure build
Do you have any benchmark between the two implementations?
I have not written a benchmark for it yet; I only ran a production test, which showed a 2~4 times improvement. Do you mean adding an extra test for it to the flink-benchmark project? @wuchong
@wuchong I wrote a test with
- 4 threads
- 400 cache items
The results are shown below:
Benchmark                  Mode  Cnt     Score     Error   Units
GuavaCacheBenchmark.get   thrpt   30  4480.563 ± 203.311  ops/ms
GuavaCacheBenchmark.put   thrpt   30  1774.769 ± 119.198  ops/ms
LRUCacheBenchmark.get     thrpt   30   441.239 ±   2.812  ops/ms
LRUCacheBenchmark.put     thrpt   30   350.549 ±  12.285  ops/ms
The results show that the new cache delivers roughly 5~10 times the throughput of the default LRUCache (about 10 times for get and about 5 times for put).
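The quoted range can be checked directly against the scores in the table. The snippet below is only a worked-arithmetic sketch; `SpeedupCheck` and `ratio` are names introduced here for illustration, and the error margins are ignored.

```java
import java.util.Locale;

/** Computes throughput ratios from the benchmark scores reported above. */
final class SpeedupCheck {
    private SpeedupCheck() {}

    /** Ratio of the new cache's throughput to the old cache's throughput. */
    static double ratio(double newOpsPerMs, double oldOpsPerMs) {
        return newOpsPerMs / oldOpsPerMs;
    }

    public static void main(String[] args) {
        // Scores in ops/ms, copied from the benchmark table.
        double getSpeedup = ratio(4480.563, 441.239);
        double putSpeedup = ratio(1774.769, 350.549);
        System.out.printf(Locale.ROOT, "get speedup: %.2fx%n", getSpeedup);
        System.out.printf(Locale.ROOT, "put speedup: %.2fx%n", putSpeedup);
    }
}
```

With these inputs the get ratio comes out slightly above 10 and the put ratio slightly above 5, which matches the 5~10 times range stated in the comment.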
ping @wuchong
@wuchong any further comments?
@flinkbot run azure
Rebased and force-pushed.