rocksdb
rocksdb copied to clipboard
JNI get_helper code sharing / multiGet() use efficient batch C++ support
Implement RAII-based helpers for JNIGet()
and multiGet()
Replace JNI C++ helpers rocksdb_get_helper
, rocksdb_get_helper_direct
, multi_get_helper
, multi_get_helper_direct
, multi_get_helper_release_keys
, txn_get_helper
, txn_multi_get_helper
The model is to entirely do away with a single helper, instead a number of utility methods allow each separate
JNI Get()
and MultiGet()
method to organise their parameters efficiently, then call the underlying C++ db->Get()
,
db->MultiGet()
, txn->Get()
or txn->MultiGet()
method itself, and
use further utilities to retrieve results.
Roughly speaking:
- get keys into C++ form
- Call C++ Get()
- get results and status into Java form
We achieve a useful performance gain as part of this work; by using the updated C++ multiGet
we immediately pick up its performance gains (batch improvements to multiGet C++ were previously implemented, but not until now used by Java/JNI). multiGetBB
already uses the batched C++
multiGet()
, and all other benchmarks show consistent improvement after the changes:
Before this PR
Benchmark (columnFamilyTestType) (keyCount) (keySize) (multiGetSize) (valueSize) Mode Cnt Score Error Units MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 256 thrpt 25 5315.459 ± 20.465 ops/s MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 1024 thrpt 25 5673.115 ± 78.299 ops/s MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 4096 thrpt 25 2616.860 ± 46.994 ops/s MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 16384 thrpt 25 1700.058 ± 24.034 ops/s MultiGetNewBenchmarks.multiGetBB200 no_column_family 10000 1024 100 65536 thrpt 25 791.171 ± 13.955 ops/s MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 256 thrpt 25 6129.929 ± 94.200 ops/s MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 1024 thrpt 25 7012.405 ± 97.886 ops/s MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 4096 thrpt 25 2799.014 ± 39.352 ops/s MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 16384 thrpt 25 1417.205 ± 22.272 ops/s MultiGetNewBenchmarks.multiGetList10 no_column_family 10000 1024 100 65536 thrpt 25 655.594 ± 13.050 ops/s MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 256 thrpt 25 6147.247 ± 82.711 ops/s MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 1024 thrpt 25 7004.213 ± 79.251 ops/s MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 4096 thrpt 25 2715.154 ± 110.017 ops/s MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 16384 thrpt 25 1408.070 ± 31.714 ops/s MultiGetNewBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 65536 thrpt 25 623.829 ± 57.374 ops/s MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 256 thrpt 25 6119.243 ± 116.313 ops/s MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 1024 thrpt 25 6931.873 ± 128.094 ops/s MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 4096 thrpt 25 2678.253 ± 39.113 ops/s MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 16384 thrpt 25 1337.384 ± 19.500 ops/s MultiGetNewBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 65536 thrpt 25 625.596 ± 14.525 ops/s
After this PR
Benchmark (columnFamilyTestType) (keyCount) (keySize) (multiGetSize) (valueSize) Mode Cnt Score Error Units MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 256 thrpt 25 5320.496 ± 24.702 ops/s MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 1024 thrpt 25 5656.477 ± 54.518 ops/s MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 4096 thrpt 25 2562.934 ± 19.211 ops/s MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 16384 thrpt 25 1677.339 ± 13.946 ops/s MultiGetBenchmarks.multiGetBB200 no_column_family 10000 1024 100 65536 thrpt 25 763.773 ± 16.721 ops/s MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 256 thrpt 25 6988.732 ± 53.948 ops/s MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 1024 thrpt 25 7531.901 ± 76.555 ops/s MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 4096 thrpt 25 2906.175 ± 35.923 ops/s MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 16384 thrpt 25 1590.764 ± 27.832 ops/s MultiGetBenchmarks.multiGetList10 no_column_family 10000 1024 100 65536 thrpt 25 743.695 ± 10.272 ops/s MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 256 thrpt 25 6883.862 ± 46.907 ops/s MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 1024 thrpt 25 7579.593 ± 96.047 ops/s MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 4096 thrpt 25 2920.348 ± 43.066 ops/s MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 16384 thrpt 25 1583.203 ± 35.926 ops/s MultiGetBenchmarks.multiGetListExplicitCF20 no_column_family 10000 1024 100 65536 thrpt 25 744.396 ± 7.844 ops/s MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 256 thrpt 25 6975.793 ± 23.194 ops/s MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 1024 thrpt 25 7600.065 ± 46.056 ops/s MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 4096 thrpt 25 2939.329 ± 41.117 ops/s MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 16384 thrpt 25 1606.477 ± 8.143 ops/s MultiGetBenchmarks.multiGetListRandomCF30 no_column_family 10000 1024 100 65536 thrpt 25 744.661 ± 9.443 ops/s
It would be good to extend this work to other helpers (other than get()), but if we did so at this point the scope would be far too great.
This PR closes https://github.com/facebook/rocksdb/issues/11518
This PR is being replaced by https://github.com/facebook/rocksdb/pull/12344 - I will delete this when that one has been completed.
I wonder if @ajkr or @pdillinger could help me here ? When I call the optimized version of db->MultiGet()
with verify_checksums
set (https://github.com/facebook/rocksdb/blob/main/include/rocksdb/db.h#L691) it appears that BLOCK_CHECKSUM_COMPUTE_COUNT
is not updated, while if I call (https://github.com/facebook/rocksdb/blob/main/include/rocksdb/db.h#L662) BLOCK_CHECKSUM_COMPUTE_COUNT
is not updated.
I have been using the changed BLOCK_CHECKSUM_COMPUTE_COUNT
to confirm that the RocksJNI version of the flag is respected. My question is whether RocksDB is still behaving correctly (verifying the checksum), and the issue is that I shouldn't rely on the checksum count to check this ?