vortex icon indicating copy to clipboard operation
vortex copied to clipboard

Reduce list contains with binary search

Open gatesn opened this issue 7 months ago • 1 comments

gatesn avatar May 19 '25 20:05 gatesn

Thanks again for looking at this!

FWIW: I have an (unstable) sketch of this that looks like:

    let mut mask = BooleanBufferBuilder::new(ends.len() - 1);

    // Sparse array: binary search in offsets and set individual bits.
    let mut set_bits = matches.set_indices();
    let mut ends_offset = 0;
    mask.resize(ends.len() - 1);
    while let Some(bit) = set_bits.next() {
        if bit < ends[ends_offset].as_() {
            // This bit is in a list that we have already visited.
            continue;
        }

        // This bit is in a list that we haven't visited yet.
        let idx = match (&ends[ends_offset..]).binary_search_by_key(&bit, |t| t.as_()) {
            Ok(idx) => idx,
            Err(idx) => idx - 1,
        };
        mask.set_bit(ends_offset + idx, true);
        ends_offset += idx + 1;
    }

    BoolArray::new(mask.into(), validity).into_array()

... because it shrank the domain of the binary search on each iteration.

stuhood avatar May 19 '25 22:05 stuhood