faiss-rs icon indicating copy to clipboard operation
faiss-rs copied to clipboard

Support binary indexes

Open aalekhpatel07 opened this issue 5 months ago • 5 comments

Hi there!

Thanks for maintaining these bindings!

So, iiuc faiss has had support for binary indexes since v1.6.2 and faiss-rs is built against v1.7.2 so I naturally assumed I'd have faiss::read_index_binary, etc methods available but that doesn't seem to be the case.

I'm not very well-versed with bindings and the whole C++-Rust interop so correct me if I'm wrong but I believe we're currently just not generating bindings for the binary indexes.

If getting this to work might not be a tall order, then would you be interested in guiding me so that I can contribute?

Thanks!

aalekhpatel07 avatar Mar 11 '24 19:03 aalekhpatel07

Thank you for your interest in these bindings!

While this crate builds against a version of Faiss with support for binary indexes, it is worth remembering that it targets the C API exclusively. Faiss did not support binary indexes when I first worked on the C API, and this API is not updated in lockstep with the rest of the library. In other words, the work on extending faiss-rs to support binary indexes needs to start at the main Faiss repository. I happen to have found a tracking issue. Should you be interested in fulfilling this, I can provide some assistance there.

Enet4 avatar Mar 24 '24 09:03 Enet4

Oh, my mistake, turns out that Faiss already has its C API extended to support binary indexes. So at this end, what we need to do is:

  1. Checkout a more recent version of the Faiss library;
  2. Regenerate the low-level bindings in faiss-sys;
  3. Design a high level API here in the faiss crate.

Enet4 avatar Mar 24 '24 10:03 Enet4

Thanks so much for initiating this work in #77. I would love to be able to help out with this endeavour. I'll base my changes on top of change/faiss-v1.8.0, and try to design the unsafe boundaries following patterns from src/index/*.rs.

Not sure if I'm tall enough to write unsafe Rust but I'll give it a shot anyway. Your reviews and guidance will always be appreciated!

aalekhpatel07 avatar Mar 24 '24 16:03 aalekhpatel07

Okay! There's some careful copypasta work in #79 that covers the base BinaryIndexImpl and all the core *Index* traits and macros are replicated and adapted for BinaryIndexes. Not so sure about the naming convention we wanna go ahead with but my current assumption is to prefix the concrete impls with Binary (i.e. IndexFlat -> BinaryIndexFlat).

PS: This is my first foray into getting C++ to behave nicely with Rust so some things could be broken but I have to say this project is so well written/maintained that suprisingly, I'm having a pleasant time getting it all working together (barring my broken cublas install on Fedora 38 which forces me to generate bindings on a nvidia/cuda:12.3.2-devel-ubi8 docker container). Also, cargo build seems to be happy with these changes so maybe I'm headed in the right direction after all?

aalekhpatel07 avatar Mar 24 '24 18:03 aalekhpatel07

On a second thought, I'll keep the naming convention the same as the C++ api (i.e. IndexFlat -> IndexBinaryFlat)

aalekhpatel07 avatar Mar 24 '24 18:03 aalekhpatel07