Paprika
Paprika copied to clipboard
Squeeze more bytes out of slottedarray
WIP
The current plan is to :-
- Add separate method to check for mid length nibblepaths (4,5,6)
- Update
longnibblepaths from >=5 earlier to >=7 now. - Use 3 nibbles (instead of current 4) of the hash for storing the nibbles, and then use the fourth nibble to store the length marker (1,2,3 OR 4,5,6 OR beyond >=7). [effectively reducing 1 nibble from each slot?]
- Update WriteImpl in NibblePath.cs, as currently first byte is used to store (length << 1 | oddity) , instead for length 4,5,6 since we already store the length in the preamble, we can directly store the trimmed nibblepath.
- Update the PrepareKey UnprepareKey functions to add this intermediary case, where it should fetch the trimmed nibblepath differently now:
- No trimmed for <=3,
- Trimmed with length in preamble for 4,5,6
- Longer keys, where 1st byte of trimmed path has length & oddity metadata
And some other details I haven't mentioned here, but I'm not able to coherently write them down today. Only pushing the first commit to start the PR, will push the rest tomorrow, after going over once again and mapping the changes properly.
What the scope of this PR doesn't include: "introducing a special-case to make SlottedArray aware of the most common key at the given level."
Assumptions (need to discuss further)
My assumption for point: "putting the length for keys of length 1-3 as the first nibble in hash, leaving 4+ to be encoded in the preamble"
length 0: special empty length 1,2,3: (2 bits) go in the hash. for preamble, we have 3 bits active ( 000 -> mask for: all nibble path (upto length 3) was in the hash itself. 111 -> mask for: len >= 10
Remaining: 001, 010, 011, 100, 101, 110 -> upto length 9 may be entered directly in the preamble.
So, for length >=10 we store length in the trimmed nibblepath instead of the preamble. And for the lesser ones, we saved 1 byte of space. We already were saving that 1 byte of space for length <= 4 initially, but if we only use 3 nibbles and only store upto length 3 in hash entirely, we save another nibble from the hash in each slot.
Benchmark Comparison :-
New Benchmarks (My Branch):-
| Method | sliceFrom | Mean | Error | StdDev |
|------------------------------------ |---------- |-------------:|-----------:|-----------:|
| Write_whole_page_of_data | ? | 20,775.28 ns | 113.080 ns | 88.285 ns |
| Read_existing_keys_prefix_different | ? | 23,420.96 ns | 150.125 ns | 117.208 ns |
| Read_existing_keys_suffix_different | ? | 23,741.09 ns | 251.329 ns | 209.871 ns |
| Read_nonexistent_keys | ? | 36,463.53 ns | 149.266 ns | 132.321 ns |
| Hash_collisions | ? | 62,296.06 ns | 332.998 ns | 295.194 ns |
| EnumerateAll | ? | 7,139.54 ns | 140.951 ns | 210.969 ns |
| Move_to_keys | ? | 167.02 ns | 3.286 ns | 2.913 ns |
| UnPrepareKey | 0 | 73.97 ns | 0.747 ns | 0.624 ns |
| UnPrepareKey | 1 | 78.35 ns | 0.373 ns | 0.312 ns |
| UnPrepareKey | 62 | 27.60 ns | 0.069 ns | 0.057 ns |
| UnPrepareKey | 63 | 27.90 ns | 0.075 ns | 0.058 ns |
| UnPrepareKey | 64 | 25.98 ns | 0.102 ns | 0.085 ns |
Old Benchmarks (Main Branch):-
| Method | sliceFrom | Mean | Error | StdDev | Median |
|------------------------------------ |---------- |-------------:|-----------:|-----------:|-------------:|
| Write_whole_page_of_data | ? | 20,464.67 ns | 180.766 ns | 150.948 ns | 20,414.13 ns |
| Read_existing_keys_prefix_different | ? | 23,294.49 ns | 459.876 ns | 949.724 ns | 22,764.73 ns |
| Read_existing_keys_suffix_different | ? | 23,996.31 ns | 473.256 ns | 722.712 ns | 23,583.91 ns |
| Read_nonexistent_keys | ? | 34,318.69 ns | 314.131 ns | 245.253 ns | 34,312.31 ns |
| Hash_collisions | ? | 48,483.20 ns | 250.959 ns | 222.469 ns | 48,398.68 ns |
| EnumerateAll | ? | 7,691.10 ns | 143.813 ns | 277.078 ns | 7,651.69 ns |
| Move_to_keys | ? | 158.60 ns | 3.125 ns | 5.473 ns | 157.52 ns |
| UnPrepareKey | 0 | 70.13 ns | 0.231 ns | 0.227 ns | 70.13 ns |
| UnPrepareKey | 1 | 76.52 ns | 1.510 ns | 2.605 ns | 75.33 ns |
| UnPrepareKey | 62 | 26.69 ns | 0.475 ns | 0.583 ns | 26.40 ns |
| UnPrepareKey | 63 | 28.55 ns | 0.573 ns | 1.351 ns | 28.13 ns |
| UnPrepareKey | 64 | 25.27 ns | 0.091 ns | 0.076 ns | 25.25 ns |
Changes Done:
- Add cases for length 5,6
- Update PrepareKey() and UnprepareKey() to support these new types
- Update ReadFrom() functions, to have another function for these cases which takes length and oddity as input (Can just pass in the preamble instead of having an int and a bool parameter)
Next steps:
- Update the WriteImpl and read methods, which set the first byte of trimmed for length and oddity metadata
- Make sure all tests pass, and run benchmarks.
- Work on the smaller keys: Possibly reduce hash size to 3 nibbles, and use preamble for length <= 6, and marker for length >=7
Skimmed through description. Looks promising. Let's go🔥
Latest Benchmarks :-
Main Branch (Original) :-
| Method | sliceFrom | Mean | Error | StdDev | Median |
|---|---|---|---|---|---|
| Write_whole_page_of_data | ? | 14,300.79 ns | 172.493 ns | 378.627 ns | 14,151.02 ns |
| Read_existing_keys_prefix_different | ? | 15,949.02 ns | 229.801 ns | 255.424 ns | 15,849.62 ns |
| Read_existing_keys_suffix_different | ? | 16,024.92 ns | 132.107 ns | 103.141 ns | 16,008.06 ns |
| Read_nonexistent_keys | ? | 24,123.01 ns | 440.956 ns | 344.269 ns | 23,951.74 ns |
| Hash_collisions | ? | 33,914.59 ns | 202.941 ns | 169.465 ns | 33,897.14 ns |
| EnumerateAll | ? | 4,801.45 ns | 36.150 ns | 30.187 ns | 4,798.74 ns |
| Move_to_keys | ? | 115.50 ns | 2.321 ns | 4.125 ns | 115.13 ns |
| UnPrepareKey | 0 | 49.24 ns | 0.288 ns | 0.225 ns | 49.17 ns |
| UnPrepareKey | 1 | 52.50 ns | 0.188 ns | 0.147 ns | 52.51 ns |
| UnPrepareKey | 62 | 18.62 ns | 0.097 ns | 0.086 ns | 18.59 ns |
| UnPrepareKey | 63 | 18.96 ns | 0.021 ns | 0.018 ns | 18.96 ns |
| UnPrepareKey | 64 | 17.88 ns | 0.059 ns | 0.049 ns | 17.87 ns |
This PR (After changes) :-
| Method | sliceFrom | Mean | Error | StdDev |
|---|---|---|---|---|
| Write_whole_page_of_data | ? | 14,737.75 ns | 291.854 ns | 273.000 ns |
| Read_existing_keys_prefix_different | ? | 16,259.73 ns | 72.654 ns | 60.669 ns |
| Read_existing_keys_suffix_different | ? | 16,541.15 ns | 311.198 ns | 345.896 ns |
| Read_nonexistent_keys | ? | 25,829.30 ns | 483.894 ns | 452.635 ns |
| Hash_collisions | ? | 44,030.26 ns | 152.549 ns | 119.100 ns |
| EnumerateAll | ? | 5,307.77 ns | 32.088 ns | 25.052 ns |
| Move_to_keys | ? | 114.29 ns | 2.295 ns | 3.436 ns |
| UnPrepareKey | 0 | 50.57 ns | 1.018 ns | 0.795 ns |
| UnPrepareKey | 1 | 53.86 ns | 0.475 ns | 0.396 ns |
| UnPrepareKey | 62 | 18.47 ns | 0.027 ns | 0.023 ns |
| UnPrepareKey | 63 | 19.26 ns | 0.380 ns | 0.390 ns |
| UnPrepareKey | 64 | 17.92 ns | 0.072 ns | 0.064 ns |
Substituted by KeyEncoding at #375