[FEA] Primitives for pre-filtering and post-filtering

Open cjnolet opened this issue 2 years ago • 2 comments

The IVF methods currently accept pre-filtering functions that are applied during the scan, but we still need optimized primitives that let users express and perform the filtering logic efficiently. Post-filtering, applied to the results after the search, could likely benefit from the same or a similar set of optimized primitives.

So far we've discussed simple bitsets, roaring bitmaps, bloom filters, and hash tables as primitives to build filtering functions on top of. Initially, these APIs could probably accept an array of ids on device or host and produce a data structure from which a filter function can be created and passed directly into search_with_filter. I think it will be important to consider the API design up front so we can provide as unified an API experience as possible.
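To make the proposed flow concrete (array of ids in, data structure out, filter function derived from it), here is a minimal host-only sketch. It is not RAFT's API: `IdBitset`, `BitsetFilter`, and `make_filter` are hypothetical names chosen for illustration, and a real primitive would live on the device and be passed to search_with_filter in whatever form that function expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side bitset over sample ids (illustration only, not a RAFT type).
class IdBitset {
 public:
  // All bits start cleared, so every id is rejected until it is explicitly set.
  explicit IdBitset(std::size_t n_ids) : n_ids_(n_ids), words_((n_ids + 63) / 64, 0) {}

  // Mark every id in `ids` as allowed. Ids must lie in [0, n_ids).
  void set(const std::vector<std::int64_t>& ids)
  {
    for (std::int64_t id : ids) {
      assert(id >= 0 && static_cast<std::size_t>(id) < n_ids_);
      words_[static_cast<std::size_t>(id) / 64] |= (std::uint64_t{1} << (id % 64));
    }
  }

  // Out-of-range ids are treated as "not allowed".
  bool test(std::int64_t id) const
  {
    if (id < 0 || static_cast<std::size_t>(id) >= n_ids_) { return false; }
    return ((words_[static_cast<std::size_t>(id) / 64] >> (id % 64)) & 1u) != 0;
  }

 private:
  std::size_t n_ids_;
  std::vector<std::uint64_t> words_;
};

// Hypothetical filter functor: returns true when the sample should be kept.
// It holds a non-owning pointer, so the bitset must outlive the filter.
struct BitsetFilter {
  const IdBitset* bitset;
  bool operator()(std::int64_t sample_id) const { return bitset->test(sample_id); }
};

// Produce a filter function from the data structure.
inline BitsetFilter make_filter(const IdBitset& bitset) { return BitsetFilter{&bitset}; }
```

With this shape, swapping the bitset for a roaring bitmap, bloom filter, or hash table would only change the data structure behind `test`, while the callable handed to the search keeps the same signature.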

Another design detail to consider is that these primitives should be able to produce a filtering function that can work with both pre-filtering and post-filtering.
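As a rough illustration of one predicate serving both modes, the sketch below runs a toy brute-force 1-D search. All names (`Neighbor`, `search_pre_filtered`, `post_filter`) are hypothetical and the scan is a stand-in for the real IVF kernels. The only assumption about the filter is that it is callable with a sample id and returns `true` to keep it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Neighbor = std::pair<float, std::int64_t>;  // (distance, sample id)

// Pre-filtering: the predicate is checked during the scan, so rejected
// samples never compete for one of the k result slots.
template <typename Filter>
std::vector<Neighbor> search_pre_filtered(const std::vector<float>& data,
                                          float query,
                                          std::size_t k,
                                          Filter keep)
{
  std::vector<Neighbor> candidates;
  for (std::size_t i = 0; i < data.size(); ++i) {
    const auto id = static_cast<std::int64_t>(i);
    if (!keep(id)) { continue; }
    candidates.emplace_back(std::fabs(data[i] - query), id);
  }
  const std::size_t n = std::min(k, candidates.size());
  std::partial_sort(candidates.begin(),
                    candidates.begin() + static_cast<std::ptrdiff_t>(n),
                    candidates.end());
  candidates.resize(n);
  return candidates;
}

// Post-filtering: the same predicate is applied to results that an
// unfiltered search already produced.
template <typename Filter>
std::vector<Neighbor> post_filter(std::vector<Neighbor> results, Filter keep)
{
  results.erase(std::remove_if(results.begin(),
                               results.end(),
                               [&](const Neighbor& nb) { return !keep(nb.second); }),
                results.end());
  return results;
}
```

Note that post-filtering can return fewer than k neighbors, because rejected samples have already taken result slots. A caller would typically need to over-fetch to compensate, which is one reason the two modes are worth supporting behind a single filter interface.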

cjnolet, Aug 15 '23 17:08

Task list so far:

  • [x] bitset
  • [ ] bitmap
  • [ ] bloom filter
  • [ ] hash table
  • [ ] hashmap

cjnolet, Sep 26 '23 20:09

Linking the discussion about template parameters for filters: https://github.com/rapidsai/raft/pull/2212#issuecomment-1979439358, and a related task:

  • [ ] avoid recompiling ivf_pq::search kernels when the index type changes

tfeher, Mar 10 '24 20:03