Implement DB benchmark
This PR implements a benchmark for DB usage.
It independently measures read and write performance of every DB implementation. This makes it possible to make informed decisions for the various flows of working with data, keeping the measurement of pure DB performance separate from the performance of other subsystems (including serialization).
See db_benchmark/README.md for more details about the benchmark.
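The measurement principle described above, keeping serialization outside the timed section so only pure DB cost is recorded, can be sketched as follows. This is a minimal Python illustration with a hypothetical `InMemoryDb` stand-in, not the actual benchmark code (which targets the real DB implementations):

```python
import time

# Hypothetical stand-in for a real DB binding (RocksDB, LMDB, ...);
# only the put/get interface matters for this sketch.
class InMemoryDb:
    def __init__(self):
        self._store = {}

    def put(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> bytes:
        return self._store[key]

def bench_writes(db: InMemoryDb, items) -> float:
    # Serialize *before* starting the clock, so the measured time is
    # pure DB write cost, not serialization cost.
    serialized = [(key, repr(value).encode()) for key, value in items]
    start = time.perf_counter()
    for key, value in serialized:
        db.put(key, value)
    return time.perf_counter() - start

db = InMemoryDb()
elapsed = bench_writes(db, [(f"key{i}", {"n": i}) for i in range(1000)])
print(elapsed >= 0.0)  # True
```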
Explain how you tested your changes:
- Executed the benchmark successfully
Checklist:
- [x] Dependency versions are unchanged
- Notify Velocity team if dependencies must change in CI
- [x] Modified the current draft of release notes with details on what is completed or incomplete within this project
- [x] Document code purpose, how to use it
- Mention expected invariants, implicit constraints
- [x] Tests were added for the new behavior
- Document test purpose, significance of failures
- Test names should reflect their purpose
- [x] All tests pass (CI will check this if you didn't)
- [x] Serialized types are in stable-versioned modules
- [x] Does this close issues? None
Result of the run with default parameters:
| Name | Time/Run | mWd/Run | mjWd/Run | Prom/Run | Percentage |
|---|---|---|---|---|---|
| rocksdb_write | 705_955.63us | 158_922.00w | 482.99w | 482.99w | 26.18% |
| rocksdb_read | 125.97us | 1_935.00w | 16_399.56w | 13.56w | - |
| lmdb_write | 2_696_045.27us | 10_047.00w | 12.96w | 12.96w | 100.00% |
| lmdb_read | 96.56us | 1_217.00w | 16_387.19w | 1.19w | - |
| single_file_write | 70_465.27us | 19_047.38w | 324.69w | 324.69w | 2.61% |
| single_file_read | 362.82us | 1_273.00w | 73_738.82w | 2.82w | 0.01% |
| multi_file_write | 83_925.40us | 559.00w | 2_048_384.01w | 382.01w | 3.11% |
| multi_file_read | 184.54us | 1_269.00w | 32_772.38w | 0.38w | - |
Benchmark Analysis: Write vs Read Performance
Test Configuration:
- Keys per block: 125
- Value size: 131,072 bytes (128 KB)
- Blocks in DB: 800
- Total data: ~100,000 keys, ~12.8 GB total
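The configuration totals can be double-checked with a few lines of arithmetic; note that the round "~12.8 GB" figure corresponds to counting 128 KB as 128,000 bytes, while the exact 131,072-byte values give ~13.1 GB:

```python
keys_per_block = 125
blocks = 800
value_size = 131_072  # 128 KiB, exact

total_keys = keys_per_block * blocks
print(total_keys)                    # 100000
print(total_keys * value_size)       # 13107200000 bytes, ~13.1 GB
print(total_keys * 128_000 / 1e9)    # 12.8 -- the "~12.8 GB" in the text
```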
Write Performance Comparison
Speed (Time/Run - lower is better):
- single_file_write: ~70ms - fastest option
- multi_file_write: ~84ms - very close second
- rocksdb_write: ~706ms - 10x slower than single file
- lmdb_write: ~2,696ms - significantly slower (38x slower than single file)
Memory Allocation:
- multi_file_write: 559w - minimal minor heap allocation
- rocksdb_write: 158,922w - high minor heap allocation
- multi_file_write (mjWd): 2,048,384w - very high major heap pressure from file operations
Read Performance Comparison
Speed (Time/Run - all very fast):
- lmdb_read: ~97µs - fastest
- rocksdb_read: ~126µs - nearly identical
- multi_file_read: ~185µs - still excellent
- single_file_read: ~363µs - slowest, but still sub-millisecond
All read operations are extremely fast (microsecond range vs millisecond writes).
Key Takeaways
- LMDB: terrible write performance (~2.7s per operation) but excellent read speed
- Simple file I/O: best write performance by far - ideal for large-value storage
- RocksDB: a balanced middle ground, but high memory usage on writes
- Large values (128 KB): simple file approaches dominate for write throughput
Recommendation: for large-value workloads like this (128 KB per value):
- Write-heavy → single_file_write is the clear winner
- Read-heavy → LMDB or RocksDB provide faster lookups
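For context, the "one file per value" layout that multi_file_write presumably exercises can be sketched like this. This is illustrative Python only; the path layout and naming are assumptions, not the repo's actual implementation:

```python
import pathlib
import tempfile

def write_value(root: pathlib.Path, key: str, value: bytes) -> None:
    # Each value lives in its own file named after the key, so one write
    # never touches other values (unlike appending to a single shared file).
    (root / key).write_bytes(value)

def read_value(root: pathlib.Path, key: str) -> bytes:
    return (root / key).read_bytes()

root = pathlib.Path(tempfile.mkdtemp())
write_value(root, "block_00000001", b"\x00" * 131_072)  # one 128 KiB value
print(len(read_value(root, "block_00000001")))  # 131072
```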
Update after optimization of multi-file writing
multi_file_write Benchmark Results
| Name | Time/Run | mWd/Run | mjWd/Run | Prom/Run | Percentage |
|---|---|---|---|---|---|
| multi_file_write | 45.50ms | 553.00w | 24.03w | 24.03w | 100.00% |
Before vs After Comparison
Performance Gains:
- Time: 83,925µs → 45,500µs (45.5ms)
- Speedup: 1.84x faster
- Memory (mWd): 559w → 553w (essentially unchanged)
- Major heap (mjWd): 2,048,384w → 24.03w
- Major heap reduction: 99.999%
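The headline numbers follow directly from the two result tables:

```python
before_us, after_us = 83_925.40, 45_500.00
print(round(before_us / after_us, 2))  # 1.84 -- the quoted speedup

before_w, after_w = 2_048_384.01, 24.03
print(round((before_w - after_w) / before_w * 100, 3))  # 99.999 -- mjWd reduction
```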
Updated Write Performance Rankings
- multi_file_write (new): ~45.5ms - the new leader
- single_file_write: ~70ms (1.54x slower)
- rocksdb_write: ~706ms (15.5x slower)
- lmdb_write: ~2,696ms (59x slower)
What Changed?
The massive mjWd reduction (from over 2M words to 24w) suggests the optimization eliminated file-system churn or excessive allocations: the speed advantage is kept while the write path becomes vastly more GC-friendly.
New recommendation: for large-value (128 KB) write workloads, multi_file_write is now the clear winner - fastest write speed and minimal heap pressure.
Smaller Values, Different Story
Full Benchmark Results (New Parameters)
Test Configuration:
- Keys per block: 32
- Value size: 9,000 bytes (8.8 KB)
- Warmup blocks: 1,000
- Total warmup: 32,000 keys
| Name | Time/Run | mWd/Run | mjWd/Run | Prom/Run | Percentage |
|---|---|---|---|---|---|
| rocksdb_write | 4,033.82us | 40,708.00w | 20.57w | 20.57w | 0.42% |
| rocksdb_read | 40.49us | 1,924.00w | 1,140.63w | 13.63w | - |
| lmdb_write | 957,662.23us | 2,596.00w | 0.65w | 0.65w | 100.00% |
| lmdb_read | 35.66us | 1,206.00w | 1,127.31w | 0.31w | - |
| single_file_write | 1,957.77us | 4,900.00w | 38.59w | 38.59w | 0.20% |
| single_file_read | 70.29us | 1,262.00w | 9,321.10w | 0.10w | - |
| multi_file_write | 501.43us | 269.00w | 36,014.88w | 12.88w | 0.05% |
| multi_file_read | 55.97us | 1,258.00w | 2,254.03w | - | - |
Before vs After (original vs optimized multi_file_write; the table above shows the original version):
- Time: 501.43µs → 839.77µs (1.67x slower)
- Memory (mWd): 269w → 274w (essentially unchanged)
- Major heap (mjWd): 36,014.88w → 6.49w
- Major heap reduction: 99.98%
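The trade-off figures quoted here can be recomputed from the two runs:

```python
orig_us, opt_us = 501.43, 839.77
print(round(opt_us / orig_us, 2))  # 1.67 -- optimized version is 1.67x slower

orig_w, opt_w = 36_014.88, 6.49
print(round((1 - opt_w / orig_w) * 100, 2))  # 99.98 -- mjWd reduction
```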
Write Performance Rankings (8.8 KB values)
- multi_file_write (original): ~501µs - fastest
- multi_file_write (optimized): ~840µs - better GC behavior
- single_file_write: ~1,958µs
- rocksdb_write: ~4,034µs
- lmdb_write: ~957,662µs - still very slow
Read Performance Rankings
- lmdb_read: ~36µs - fastest
- rocksdb_read: ~40µs
- multi_file_read: ~56µs
- single_file_read: ~70µs
Key Observations
Compared to the 128 KB value test:
- All operations are significantly faster with smaller values (8.8 KB vs 128 KB)
- A trade-off emerged: the optimization reduced mjWd by 99.98% but slowed writes by 1.67x
- RocksDB becomes competitive at smaller value sizes (~4ms vs ~706ms previously)
- LMDB still struggles with writes but dominates reads
Optimization trade-off: the optimized version trades some speed for much better GC behavior. Depending on the workload (GC pressure vs raw throughput), either version could be preferable.