Kaival Parikh comments

Results: 37 comments of Kaival Parikh

Enable Faiss-based vector format to index larger number of vectors in a single segment

@mikemccand I stumbled upon a way to allocate a `long[]` in native memory using a specific byte order (`LITTLE_ENDIAN`) -- which we use in a filtered search (i.e. if an...

Enable Faiss-based vector format to index larger number of vectors in a single segment

> entry in `CHANGES.txt`

Thanks @mikemccand, I thought it was a follow-up to the original PR adding the codec, and may not need a separate entry -- but I've added...

Support multiple HNSW graphs backed by the same vectors

Thank you for your input everyone!

> I'm wondering if ACORN would work for this use case

@dungba88 while ACORN may speed up the graph-search component of a pre-filtered search...

Support multiple HNSW graphs backed by the same vectors

### What if we de-duplicate vectors in Lucene?

- Today, we have a [`Lucene99FlatVectorsFormat`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsFormat.java) responsible for reading / writing raw vectors
- This format [maintains a list](https://github.com/apache/lucene/blob/ac90517c17ef78a469c65868e2026461f6ddcddc/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99FlatVectorsWriter.java#L408) of vectors _per-field_...

Support multiple HNSW graphs backed by the same vectors

> What if it only did so within one document, which would enable this "compile KNN prefilter to separate field's HNSW graph during indexing" efficiently? But not across documents

Thanks...

Support multiple HNSW graphs backed by the same vectors

@mikemccand I was able to hack luceneutil to perform the following benchmark:

- Take an additional input `filterFactor`, where documents with `ID % filterFactor == 0` are considered "live" (so...

Support multiple HNSW graphs backed by the same vectors

Thank you for your inputs @mikemccand @benwtrent :)

> Let's get your luceneutil changes merged -- this is useful for benchmarking

I had it in a shared branch earlier, opened...

Integrating GPU based Vector Search using cuVS

Exciting change! Since this PR adds a new codec for vector search, I wanted to point to #14178 along similar lines -- adding a new Faiss-based KNN format to index...

Examine adding more off-heap vector scoring

FYI I opened #14863 for off-heap quantized scoring, would appreciate reviews!

Feature/scalar quantized off heap scoring

+1 to this feature

I work on Amazon product search, and in one of our searchers we see a high proportion of CPU cycles within HNSW search being spent in...