k-NN
Auto-filter deleted documents on search for native engines
Description
Recently, we added a lot of filtering capabilities for Faiss, so users can pass filters to HNSW, IVF, IVFPQ, etc. When searching a segment, we should automatically add deleted documents to this filter so that Faiss skips them during the search, instead of returning them as results that then get filtered out afterwards. Currently, Lucene implements this behavior for HNSW: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java#L236-L238. There was some discussion around this in #1003.
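To illustrate the difference, here is a minimal sketch, not the plugin's or Faiss's actual code. It assumes a brute-force search standing in for the native engine, and the names `combine_filter_with_live_docs` and `search` are hypothetical. It shows how dropping deleted documents after the search can return fewer than k results, while folding them into the filter up front does not.

```python
import heapq


def combine_filter_with_live_docs(filter_ids, live_docs, max_doc):
    """Return the set of doc IDs the search is allowed to visit.

    filter_ids: iterable of doc IDs matching the user's filter, or None
    live_docs:  list of bools where False marks a deleted doc, or None
                when the segment has no deletions
    (Hypothetical helper, for illustration only.)
    """
    candidates = set(range(max_doc)) if filter_ids is None else set(filter_ids)
    if live_docs is None:
        return candidates
    return {doc for doc in candidates if live_docs[doc]}


def search(vectors, query, k, allowed=None):
    """Exact k-NN by squared L2 distance, restricted to `allowed` doc IDs.

    A brute-force stand-in for a native engine search.
    """
    scored = (
        (sum((a - b) ** 2 for a, b in zip(vec, query)), doc)
        for doc, vec in enumerate(vectors)
        if allowed is None or doc in allowed
    )
    return [doc for _, doc in heapq.nsmallest(k, scored)]


# Made-up segment: five one-dimensional vectors, docs 0 and 1 deleted.
vectors = [[0.0], [1.0], [2.0], [3.0], [4.0]]
live_docs = [False, False, True, True, True]
query = [0.0]

# Post-filtering: search first, drop deleted docs afterwards.
post_filtered = [d for d in search(vectors, query, k=2) if live_docs[d]]

# Auto-filtering: deleted docs are excluded before the search runs.
allowed = combine_filter_with_live_docs(None, live_docs, len(vectors))
pre_filtered = search(vectors, query, k=2, allowed=allowed)
```

In this toy segment the two nearest neighbors are both deleted, so post-filtering leaves nothing, whereas the auto-filtered search still returns two live documents.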
Thanks @jmazanec15 for opening up this issue. I was going to open this issue. I think we should definitely implement a deleted documents filtering capability for Faiss.
But before we implement it, we should optimize filtering in general for k-NN. I am working on detailing all the different optimizations that are required in filtering. Once we have those optimizations in place, we should implement this feature.
Moving to 2.15. I don't think we are going to get this into 2.14.
Moving to 2.16.