core-rs-albatross
core-rs-albatross copied to clipboard
Improve performance of TrieNode serialization
Improve performance of TrieNode
serialization.
Pull request checklist
- [X] All tests pass. The project builds and runs.
- [X] I have resolved any merge conflicts.
- [X] I have resolved all
clippy
andrustfmt
warnings.
Looking at it.
I did some changes to postcard(!) and the way we serialize hashes (thanks for the nice benchmarks btw), the result is as follows:
empty/serde time: [20.025 ns 20.300 ns 20.662 ns]
empty/plain time: [18.265 ns 18.381 ns 18.567 ns]
root/serde time: [34.466 ns 34.592 ns 34.742 ns]
root/plain time: [29.107 ns 29.238 ns 29.394 ns]
root_incomplete/serde time: [36.553 ns 36.746 ns 36.965 ns]
root_incomplete/plain time: [34.310 ns 34.487 ns 34.695 ns]
leaf32/serde time: [25.873 ns 25.951 ns 26.050 ns]
leaf32/plain time: [23.987 ns 24.149 ns 24.346 ns]
leaf128/serde time: [28.838 ns 28.896 ns 28.977 ns]
leaf128/plain time: [24.703 ns 24.794 ns 24.903 ns]
leaf1024/serde time: [35.356 ns 35.532 ns 35.784 ns]
leaf1024/plain time: [31.009 ns 31.204 ns 31.406 ns]
leaf16834/serde time: [214.04 ns 214.85 ns 215.63 ns]
leaf16834/plain time: [191.85 ns 192.91 ns 194.14 ns]
leaf32768/serde time: [680.77 ns 683.33 ns 686.13 ns]
leaf32768/plain time: [641.49 ns 644.56 ns 647.63 ns]
leaf131072/serde time: [2.9458 µs 2.9569 µs 2.9696 µs]
leaf131072/plain time: [2.9047 µs 2.9149 µs 2.9276 µs]
The results show that the tweaked serde/postcard serialization is consistently a bit slower than the handrolled one, but always within ~10%.
Given the improved performance of the tweaked serde/postcard serialization, I think not hand-rolling our own serialization is better because it's introducing less complexity.
The way forward for this would be me pull-requesting my changes to postcard. Since we probably want to have that either way, I'm going to prepare the pull request for postcard.