roaring-rs
roaring-rs copied to clipboard
Optimization(?): Clone-on-write containers
Cow would allow optimizations that borrow containers. For a trivial example. Imagine an |
operation over two containers whose containers are disjoint.
I think the best design for this would be making RoaringBitmap
generic over it's container type. The default would use Arc
s to share containers under the hood. Users would not need to interact with the type parameter unless they want to, similar to the S
type in Hashmap<K, V, S>
That's interesting, however, I fear at least two things:
- Storing the internal stores/containers in an
Arc
orRc
creates one more indirection and could slow down some other operations. - Doing a clone of an
Arc
requires an atomic operation that forces a NUMA synchronization, will it be faster than a dumbmemcpy
for single-threaded benchmarks?
The second point could potentially be addressed by using Rc
s instead, using the Rc::make_mut
method when needed. Maybe the default container type should be something like Unit
that simply memcpy
like the current version and we could let the user decide the custom container they want to use instead i.e. Arc
, Rc
.
In any case, this change could be quite complex to introduce without breaking too much things 😄
Also, if we did it with just ref counting either bitmaps would no longer be Sync
I was thinking of this is an optimization of space, not time.
In any case, this change could be quite complex to introduce without breaking too much things 😄
Another option, rather than abstracting over borrowing containers would be to make the borrows explicit. borrowck
Is pretty good at it's job .