Auto-filter deleted documents on search for native engines

Open jmazanec15 opened this issue 1 year ago • 4 comments

Description

Recently, we added a lot of filtering capabilities for faiss, so users can pass filters to hnsw, ivf, ivfpq, etc. When searching a segment, we should automatically add deleted documents to this filter so that faiss skips them during the search, instead of returning them as results only to have them filtered out afterward. Lucene already implements this behavior for HNSW: https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99HnswVectorsReader.java#L236-L238. There was some discussion around this in #1003.
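To illustrate the idea, here is a minimal sketch in plain Python, not the plugin's actual code. It assumes a segment exposes its live (non-deleted) doc IDs, with `None` meaning the segment has no deletions, and it uses a brute-force scan as a stand-in for the native engine. The names `effective_filter` and `search_segment` are hypothetical.

```python
from typing import Iterable, List, Optional, Sequence, Set


def effective_filter(
    max_doc: int,
    live_docs: Optional[Iterable[int]],
    user_filter: Optional[Iterable[int]],
) -> Set[int]:
    """Combine the user's filter with the segment's live docs.

    Assumption: live_docs is None when the segment has no deletions,
    and user_filter is None when the query carries no filter.
    """
    allowed = set(range(max_doc)) if user_filter is None else set(user_filter)
    if live_docs is not None:
        # Drop deleted docs before the search, not after it.
        allowed &= set(live_docs)
    return allowed


def search_segment(
    vectors: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
    allowed: Set[int],
) -> List[int]:
    """Brute-force stand-in for a native engine search that honors a filter."""
    scored = []
    for doc_id, vec in enumerate(vectors):
        if doc_id not in allowed:
            continue  # skipped during the search, so it never takes a top-k slot
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, doc_id))
    scored.sort()
    return [doc_id for _, doc_id in scored[:k]]


# Toy segment: doc 0 is the nearest neighbor but has been deleted.
vectors = [[0.0], [1.0], [2.0], [3.0]]
live_docs = {1, 2, 3}
query = [0.0]

# Filtering during the search still yields k live results.
pre_filtered = search_segment(
    vectors, query, 2, effective_filter(len(vectors), live_docs, None)
)

# Filtering afterward loses a slot to the deleted doc.
unfiltered = search_segment(vectors, query, 2, set(range(len(vectors))))
post_filtered = [d for d in unfiltered if d in live_docs]
```

In this toy case the pre-filtered search returns two live documents, while post-filtering returns only one because the deleted document occupied a top-k slot.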

jmazanec15, Feb 21 '24 17:02

Thanks @jmazanec15 for opening up this issue. I was going to open this issue. I think we should definitely implement a deleted documents filtering capability for Faiss.

But before we implement it, we should optimize filtering in general for k-NN. I am working on detailing all the optimizations that filtering requires. Once those optimizations are in place, we should implement this feature.

navneet1v, Feb 22 '24 18:02

Moving to 2.15. I don't think we are going to get this into 2.14.

jmazanec15, Apr 29 '24 15:04

Moving to 2.16.

vamshin, Jun 13 '24 22:06