roaring-rs
roaring-rs copied to clipboard
Introduce multi operations helper methods
Closes #58 and Fixes #57.
I tried the "naive lazy" alg as suggested by @lemire in #58 and it is indeed faster (and simpler) for OR. It also does better than sequential for XOR.
Unsurprisingly it's worse for SUB and AND.
My tests were all using WIKILEAKS_NOQUOTES_SRT
My experiments are here https://github.com/saik0/roaring-rs/blob/naive-lazy-ops/src/bitmap/multi_ops.rs#L305.
Curiosity got the better of me, so I tied some other strategies.
- Inside the multi-ops there the container access pattern is effectively random. BTreeMap and HashMap keyed by the container key were both improvements. HashMap was especially fast when using the identity hash.
- Copying can be avoided with COW
Unfortunately, you broke the link, next time could you provide a permanent link, please 😄
I think you can open a draft PR on my own branch maybe, this way I can see the changes you made. It seems like you have already done a lot of work there and you PR outsize mine and opening a PR on main/master is maybe better 😂
Note that I am impressed by the work you do, we both clearly want to make it as fast if not faster than the C roaring library! 🚀
Unfortunately, you broke the link, next time could you provide a permanent link, please 😄
Yes. Next time. 😄
I think you can open a draft PR on my own branch maybe, this way I can see the changes you made. It seems like you have already done a lot of work there and you PR outsize mine and opening a PR on main/master is maybe better 😂
The branch includes several slightly different variations of the same alg. It also checks in the sample datasets into the repo for benchmarking. It's hardly merge ready, but it's helpful for evaluating which strategy to choose.
Here's a PR on my own repo: saik0/roaring-rs#1
Note that I am impressed by the work you do
Thank you! I've had COVID for the last week and have been feeling awful and isolated. Hearing that felt pretty good! 😄
make it as fast if not faster than the C roaring library!
Thats a tall order!
The branch includes several slightly different variations of the same alg. It also checks in the sample datasets into the repo for benchmarking. It's hardly merge ready, but it's helpful for evaluating which strategy to choose.
Here's a PR on my own repo: saik0#1
Thank you! I've had COVID for the last week and have been feeling awful and isolated. Hearing that felt pretty good! 😄
I hope you're fine now! I caught it in 2019 too, lost taste and that was hard too. I understand! Just do what you love most and wait until you recover taste 😃
Thats a tall order!
Indeed, let's try to do our best, you already have done a lot with those small PRs!
Maybe you can find the time to publish the benchmarks of your experiments from the https://github.com/saik0/roaring-rs/pull/1 and open a PR on this repo? But maybe you should wait until we merge most of them, would be better to avoid merge conflicts, as you wish!
Here are the benchmarks from my experiments on saik0#1

Wow, the diff with the multi-or is quite impressive, I don't have a comparison with the multi-xor, though! 😃