rax icon indicating copy to clipboard operation
rax copied to clipboard

Add state to the iterator to avoid re-scans for children

Open michaelplaing opened this issue 2 years ago • 3 comments

This PR:

  • Adds a byte stack to raxIterator that contains child offsets.
  • Adds a separate, more complicated and slightly slower raxLowWalkSeek() for use by raxSeek() - this preserves the speed of the simpler original raxLowWalk() which is heavily used by raxFind() etc.
  • Modifies the raxSeek(), raxNext() and raxPrior() functions to use the above.
  • Also incorporates a substantive simplification made to raxSeek() taken from Redis 7.0.

The default rax-tests pass, including the fuzz tests.

michaelplaing avatar May 01 '22 15:05 michaelplaing

I merged this in a branch of my fork for testing: https://github.com/jcorporation/rax/tree/michaelplaing-iter_scan_upgrade

I have run the benchmark on master (without this patch) and your patch merged (branch michaelplaing-iter_scan_upgrade) and could see only minor timing changes. Do I miss anything?

jcorporation avatar May 05 '22 09:05 jcorporation

There are only minor improvements on the original default tests as they have a pretty sparse keyspaces. Since the spans in each node therefore are short, there is not much room to improve. Tests using binary data can show much more improvement, depending on density. timestamps might be a good example, if they are issued close together. Redis uses millions of them and they are iterated in the event loop. Sequential IDs might be another usage.

michaelplaing avatar May 05 '22 18:05 michaelplaing

How large must be the keyspace to see improvements? A benchmark for illustration would be good.

jcorporation avatar May 15 '22 15:05 jcorporation