roaring-rs icon indicating copy to clipboard operation
roaring-rs copied to clipboard

Optimization(?): Clone-on-write containers

Open saik0 opened this issue 3 years ago • 2 comments

Cow would allow optimizations that borrow containers. For a trivial example. Imagine an | operation over two containers whose containers are disjoint.

I think the best design for this would be making RoaringBitmap generic over it's container type. The default would use Arcs to share containers under the hood. Users would not need to interact with the type parameter unless they want to, similar to the S type in Hashmap<K, V, S>

saik0 avatar Jan 23 '22 05:01 saik0

That's interesting, however, I fear at least two things:

  1. Storing the internal stores/containers in an Arc or Rc creates one more indirection and could slow down some other operations.
  2. Doing a clone of an Arc requires an atomic operation that forces a NUMA synchronization, will it be faster than a dumb memcpy for single-threaded benchmarks?

The second point could potentially be addressed by using Rcs instead, using the Rc::make_mutmethod when needed. Maybe the default container type should be something like Unit that simply memcpy like the current version and we could let the user decide the custom container they want to use instead i.e. Arc, Rc.

In any case, this change could be quite complex to introduce without breaking too much things 😄

Kerollmops avatar Jan 23 '22 14:01 Kerollmops

Also, if we did it with just ref counting either bitmaps would no longer be Sync

I was thinking of this is an optimization of space, not time.

In any case, this change could be quite complex to introduce without breaking too much things 😄

Another option, rather than abstracting over borrowing containers would be to make the borrows explicit. borrowck Is pretty good at it's job .

saik0 avatar Jan 23 '22 16:01 saik0