CRoaring icon indicating copy to clipboard operation
CRoaring copied to clipboard

Feature request: intersection, union, and difference with the tail of another bitmap

Open tudor opened this issue 1 year ago • 4 comments

I have two bitmaps a and b, I find one point inside b (say using an iterator), and then I want to intersect / union / compute the set difference of a with the tail of b after that point.

That is, I'd like the equivalent of

b.removeRange(0, x);
a &= b;

without the overhead of mutating b (the memmove from array containers shows up meaningfully in my CPU profile).

Roaring iterators already have all the information needed to implement this efficiently, and I sometimes need to walk through a to find the right spot (it's not as easy as rank(x)) so I'd prefer this:

auto it = b.begin();
it.move(x);
a.intersectTail(it);
// intersectRange(it, b.end()) would be nice too, but that means you have to be able to 
// deal with partial containers at both ends instead of just at the beginning so it's a bit
// harder to implement

Do you think this is worth implementing?

Thanks, -Tudor.

tudor avatar Feb 09 '24 19:02 tudor

I think we could support functions such as...

  • roaring_bitmap_and_inplace_range(a, b, start, end)
  • roaring_bitmap_andnot_inplace_range(a, b, start, end)
  • roaring_bitmap_or_inplace_range(a, b, start, end)
  • roaring_bitmap_xor_inplace_range(a, b, start, end)

would be nice too, but that means you have to be able to deal with partial containers at both ends instead of just at the beginning so it's a bit harder to implement

Is it? I suspect that there is a non-trivial fixed cost involved in starting to code this functionality, but the added complexity of handling both ends is probably not nearly 2x. It may take 15% longer, I would estimate. Of course, the result would be far neater.

lemire avatar Feb 09 '24 21:02 lemire

rages in off-by-one errors

tudor avatar Feb 09 '24 21:02 tudor

Also note that we'd have to implement out-of-place versions as well (at least at the container level), as in-place isn't really in-place for COW bitmaps.

tudor avatar Feb 09 '24 21:02 tudor

Right. So it is a non-trivial amount of work, for sure!!! My argument is that once you get started, it is not much more effort to do a symmetrical work.

It can't be done in a minutes, no matter what.

lemire avatar Feb 09 '24 22:02 lemire