Paprika icon indicating copy to clipboard operation
Paprika copied to clipboard

Squeeze more bytes out of slottedarray

Open AryanGodara opened this issue 1 year ago • 3 comments
trafficstars

WIP

The current plan is to :-

  1. Add separate method to check for mid length nibblepaths (4,5,6)
  2. Update long nibblepaths from >=5 earlier to >=7 now.
  3. Use 3 nibbles (instead of current 4) of the hash for storing the nibbles, and then use the fourth nibble to store the length marker (1,2,3 OR 4,5,6 OR beyond >=7). [effectively reducing 1 nibble from each slot?]
  4. Update WriteImpl in NibblePath.cs, as currently first byte is used to store (length << 1 | oddity) , instead for length 4,5,6 since we already store the length in the preamble, we can directly store the trimmed nibblepath.
  5. Update the PrepareKey UnprepareKey functions to add this intermediary case, where it should fetch the trimmed nibblepath differently now:
    1. No trimmed for <=3,
    2. Trimmed with length in preamble for 4,5,6
    3. Longer keys, where 1st byte of trimmed path has length & oddity metadata

And some other details I haven't mentioned here, but I'm not able to coherently write them down today. Only pushing the first commit to start the PR, will push the rest tomorrow, after going over once again and mapping the changes properly.

What the scope of this PR doesn't include: "introducing a special-case to make SlottedArray aware of the most common key at the given level."

Assumptions (need to discuss further)

My assumption for point: "putting the length for keys of length 1-3 as the first nibble in hash, leaving 4+ to be encoded in the preamble"

length 0: special empty length 1,2,3: (2 bits) go in the hash. for preamble, we have 3 bits active ( 000 -> mask for: all nibble path (upto length 3) was in the hash itself. 111 -> mask for: len >= 10

Remaining: 001, 010, 011, 100, 101, 110 -> upto length 9 may be entered directly in the preamble.

So, for length >=10 we store length in the trimmed nibblepath instead of the preamble. And for the lesser ones, we saved 1 byte of space. We already were saving that 1 byte of space for length <= 4 initially, but if we only use 3 nibbles and only store upto length 3 in hash entirely, we save another nibble from the hash in each slot.

Benchmark Comparison :-

New Benchmarks (My Branch):-
| Method                              | sliceFrom | Mean         | Error      | StdDev     |
|------------------------------------ |---------- |-------------:|-----------:|-----------:|
| Write_whole_page_of_data            | ?         | 20,775.28 ns | 113.080 ns |  88.285 ns |
| Read_existing_keys_prefix_different | ?         | 23,420.96 ns | 150.125 ns | 117.208 ns |
| Read_existing_keys_suffix_different | ?         | 23,741.09 ns | 251.329 ns | 209.871 ns |
| Read_nonexistent_keys               | ?         | 36,463.53 ns | 149.266 ns | 132.321 ns |
| Hash_collisions                     | ?         | 62,296.06 ns | 332.998 ns | 295.194 ns |
| EnumerateAll                        | ?         |  7,139.54 ns | 140.951 ns | 210.969 ns |
| Move_to_keys                        | ?         |    167.02 ns |   3.286 ns |   2.913 ns |
| UnPrepareKey                        | 0         |     73.97 ns |   0.747 ns |   0.624 ns |
| UnPrepareKey                        | 1         |     78.35 ns |   0.373 ns |   0.312 ns |
| UnPrepareKey                        | 62        |     27.60 ns |   0.069 ns |   0.057 ns |
| UnPrepareKey                        | 63        |     27.90 ns |   0.075 ns |   0.058 ns |
| UnPrepareKey                        | 64        |     25.98 ns |   0.102 ns |   0.085 ns |

Old Benchmarks (Main Branch):-
| Method                              | sliceFrom | Mean         | Error      | StdDev     | Median       |
|------------------------------------ |---------- |-------------:|-----------:|-----------:|-------------:|
| Write_whole_page_of_data            | ?         | 20,464.67 ns | 180.766 ns | 150.948 ns | 20,414.13 ns |
| Read_existing_keys_prefix_different | ?         | 23,294.49 ns | 459.876 ns | 949.724 ns | 22,764.73 ns |
| Read_existing_keys_suffix_different | ?         | 23,996.31 ns | 473.256 ns | 722.712 ns | 23,583.91 ns |
| Read_nonexistent_keys               | ?         | 34,318.69 ns | 314.131 ns | 245.253 ns | 34,312.31 ns |
| Hash_collisions                     | ?         | 48,483.20 ns | 250.959 ns | 222.469 ns | 48,398.68 ns |
| EnumerateAll                        | ?         |  7,691.10 ns | 143.813 ns | 277.078 ns |  7,651.69 ns |
| Move_to_keys                        | ?         |    158.60 ns |   3.125 ns |   5.473 ns |    157.52 ns |
| UnPrepareKey                        | 0         |     70.13 ns |   0.231 ns |   0.227 ns |     70.13 ns |
| UnPrepareKey                        | 1         |     76.52 ns |   1.510 ns |   2.605 ns |     75.33 ns |
| UnPrepareKey                        | 62        |     26.69 ns |   0.475 ns |   0.583 ns |     26.40 ns |
| UnPrepareKey                        | 63        |     28.55 ns |   0.573 ns |   1.351 ns |     28.13 ns |
| UnPrepareKey                        | 64        |     25.27 ns |   0.091 ns |   0.076 ns |     25.25 ns |

AryanGodara avatar May 30 '24 22:05 AryanGodara

Changes Done:

  1. Add cases for length 5,6
  2. Update PrepareKey() and UnprepareKey() to support these new types
  3. Update ReadFrom() functions, to have another function for these cases which takes length and oddity as input (Can just pass in the preamble instead of having an int and a bool parameter)

Next steps:

  1. Update the WriteImpl and read methods, which set the first byte of trimmed for length and oddity metadata
  2. Make sure all tests pass, and run benchmarks.
  3. Work on the smaller keys: Possibly reduce hash size to 3 nibbles, and use preamble for length <= 6, and marker for length >=7

AryanGodara avatar May 31 '24 07:05 AryanGodara

Skimmed through description. Looks promising. Let's go🔥

Scooletz avatar May 31 '24 10:05 Scooletz

Latest Benchmarks :-

Main Branch (Original) :-

Method sliceFrom Mean Error StdDev Median
Write_whole_page_of_data ? 14,300.79 ns 172.493 ns 378.627 ns 14,151.02 ns
Read_existing_keys_prefix_different ? 15,949.02 ns 229.801 ns 255.424 ns 15,849.62 ns
Read_existing_keys_suffix_different ? 16,024.92 ns 132.107 ns 103.141 ns 16,008.06 ns
Read_nonexistent_keys ? 24,123.01 ns 440.956 ns 344.269 ns 23,951.74 ns
Hash_collisions ? 33,914.59 ns 202.941 ns 169.465 ns 33,897.14 ns
EnumerateAll ? 4,801.45 ns 36.150 ns 30.187 ns 4,798.74 ns
Move_to_keys ? 115.50 ns 2.321 ns 4.125 ns 115.13 ns
UnPrepareKey 0 49.24 ns 0.288 ns 0.225 ns 49.17 ns
UnPrepareKey 1 52.50 ns 0.188 ns 0.147 ns 52.51 ns
UnPrepareKey 62 18.62 ns 0.097 ns 0.086 ns 18.59 ns
UnPrepareKey 63 18.96 ns 0.021 ns 0.018 ns 18.96 ns
UnPrepareKey 64 17.88 ns 0.059 ns 0.049 ns 17.87 ns

This PR (After changes) :-

Method sliceFrom Mean Error StdDev
Write_whole_page_of_data ? 14,737.75 ns 291.854 ns 273.000 ns
Read_existing_keys_prefix_different ? 16,259.73 ns 72.654 ns 60.669 ns
Read_existing_keys_suffix_different ? 16,541.15 ns 311.198 ns 345.896 ns
Read_nonexistent_keys ? 25,829.30 ns 483.894 ns 452.635 ns
Hash_collisions ? 44,030.26 ns 152.549 ns 119.100 ns
EnumerateAll ? 5,307.77 ns 32.088 ns 25.052 ns
Move_to_keys ? 114.29 ns 2.295 ns 3.436 ns
UnPrepareKey 0 50.57 ns 1.018 ns 0.795 ns
UnPrepareKey 1 53.86 ns 0.475 ns 0.396 ns
UnPrepareKey 62 18.47 ns 0.027 ns 0.023 ns
UnPrepareKey 63 19.26 ns 0.380 ns 0.390 ns
UnPrepareKey 64 17.92 ns 0.072 ns 0.064 ns

AryanGodara avatar Jun 06 '24 18:06 AryanGodara

Substituted by KeyEncoding at #375

Scooletz avatar Sep 04 '24 12:09 Scooletz