Kaival Parikh
@mikemccand I stumbled upon a way to allocate a `long[]` in native memory using a specific byte order (`LITTLE_ENDIAN`) -- which we use in a filtered search (i.e. if an...
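For readers unfamiliar with the idea, here is a minimal sketch of one way to get `long`-sized storage outside the Java heap with an explicit byte order. It uses only the standard `java.nio` API and is an assumption for illustration, not necessarily the approach described in the comment above. The class name `NativeLongs` and the method `allocate` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.LongBuffer;

// Hypothetical helper, not part of Lucene or the PR under discussion.
final class NativeLongs {
  // Reserves off-heap space for `count` longs and returns a view that
  // reads and writes each long in little-endian byte order.
  static LongBuffer allocate(int count) {
    return ByteBuffer.allocateDirect(Math.multiplyExact(count, Long.BYTES))
        .order(ByteOrder.LITTLE_ENDIAN) // the view inherits this order
        .asLongBuffer();
  }

  public static void main(String[] args) {
    LongBuffer words = allocate(4);
    words.put(2, 42L);
    System.out.println(words.isDirect() + " " + words.order() + " " + words.get(2));
  }
}
```

The byte order has to be set on the `ByteBuffer` before calling `asLongBuffer()`, because the view captures the order at creation time.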
> entry in `CHANGES.txt`

Thanks @mikemccand, I thought it was a follow-up to the original PR adding the codec, and may not need a separate entry -- but I've added...
Thank you for your input everyone!

> I'm wondering if ACORN would work for this use case

@dungba88 while ACORN may speed up the graph-search component of a pre-filtered search...
### What if we de-duplicate vectors in Lucene?

- Today, we have a [`Lucene99FlatVectorsFormat`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsFormat.java) responsible for reading / writing raw vectors
- This format [maintains a list](https://github.com/apache/lucene/blob/ac90517c17ef78a469c65868e2026461f6ddcddc/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java#L408) of vectors _per-field_...
> What if it only did so within one document, which would enable this "compile KNN prefilter to separate field's HNSW graph during indexing" efficiently? But not across documents

Thanks...
@mikemccand I was able to hack luceneutil to perform the following benchmark:

- Take an additional input `filterFactor`, where documents with `ID % filterFactor == 0` are considered "live" (so...
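The `ID % filterFactor == 0` rule can be sketched in a few lines. This is only an illustration of the selection rule, not the actual luceneutil change. It assumes IDs run from `0` to `maxDoc - 1`, and the names `FilterFactorDemo`, `liveDocs` and `maxDoc` are hypothetical.

```java
import java.util.BitSet;

// Hypothetical illustration of the "live" rule; not luceneutil code.
final class FilterFactorDemo {
  // Marks every ID in [0, maxDoc) with ID % filterFactor == 0 as live.
  static BitSet liveDocs(int maxDoc, int filterFactor) {
    if (filterFactor <= 0) {
      throw new IllegalArgumentException("filterFactor must be positive");
    }
    BitSet live = new BitSet(maxDoc);
    // Stepping by filterFactor visits exactly the multiples of filterFactor.
    for (int id = 0; id < maxDoc; id += filterFactor) {
      live.set(id);
    }
    return live;
  }

  public static void main(String[] args) {
    System.out.println(liveDocs(10, 3)); // IDs 0, 3, 6, 9 are live
  }
}
```

Under this rule roughly one in every `filterFactor` documents passes the filter, so a larger `filterFactor` means a more selective filter.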
Thank you for your inputs @mikemccand @benwtrent :)

> Let's get your luceneutil changes merged -- this is useful for benchmarking

I had it in a shared branch earlier, opened...
Exciting change! Since this PR adds a new codec for vector search, I wanted to point to #14178 along similar lines -- adding a new Faiss-based KNN format to index...
FYI I opened #14863 for off-heap quantized scoring, would appreciate reviews!
+1 to this feature

I work on Amazon product search, and in one of our searchers we see a high proportion of CPU cycles within HNSW search being spent in...