faiss-rs

Slow results

Open BerserkerMother opened this issue 1 year ago • 2 comments

Hi, I have checked the package and tested search in both Python and Rust; however, the Python version is significantly faster. I am using an Ultra 5 chip.

BerserkerMother avatar Aug 26 '24 15:08 BerserkerMother

There isn't anything about the Rust bindings that would influence performance so drastically, so this is most likely a case of linking against a build of the library with fewer optimizations or a reduced instruction set. If you can describe how you built faiss-rs and where you got the Python version from, we can draw some conclusions.

Enet4 avatar Aug 27 '24 08:08 Enet4
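
As a first diagnostic for the "reduced instruction set" possibility, it can help to confirm which SIMD extensions the CPU reports at runtime. The following is a minimal sketch using only the Rust standard library; the function name `simd_support` and the choice of features to list are assumptions made for illustration. It reports what the CPU supports, not what the linked FAISS library was compiled to use, so it only narrows the question down.

```rust
/// Report which SIMD extensions the current CPU advertises at runtime.
/// Note: this says nothing about how the linked libfaiss was compiled.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn simd_support() -> Vec<(&'static str, bool)> {
    vec![
        ("sse4.2", std::is_x86_feature_detected!("sse4.2")),
        ("avx2", std::is_x86_feature_detected!("avx2")),
        ("fma", std::is_x86_feature_detected!("fma")),
        ("avx512f", std::is_x86_feature_detected!("avx512f")),
    ]
}

/// Fallback for non-x86 targets, where the detection macro is unavailable.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn simd_support() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    for (name, supported) in simd_support() {
        println!("{name}: {supported}");
    }
}
```

If the CPU reports AVX2 but the locally built library was compiled for a generic target, that mismatch would be consistent with the explanation above.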

Thank you for your response and the great work. On Arch Linux, I first ran:

sudo pacman -Sy intel-oneapi-mkl

then

cmake -B build -DFAISS_ENABLE_GPU=OFF -DFAISS_ENABLE_C_API=ON -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_PYTHON=OFF -DMKL_LIBRARIES=/opt/intel/mkl/lib/intel64/libmkl_rt.so
make -C build -j 4
cd build
sudo make install

In the project directory, I ran cargo add faiss and used this code:

use std::time;

use faiss::{index_factory, Index, MetricType};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let my_data = vec![1f32 / 1000.0; 256 * 100000];
    let mut index = index_factory(256, "Flat", MetricType::L2)?;
    index.add(&my_data)?;
    let my_query = vec![2f32 / 1000.0; 256 * 10000];
    let start = time::Instant::now();
    let result = index.search(&my_query, 5)?;
    let duration = start.elapsed();
    // for (i, (l, d)) in result
    //     .labels
    //     .iter()
    //     .zip(result.distances.iter())
    //     .enumerate()
    // {
    //     println!("#{}: {} (D={})", i + 1, *l, *d);
    // }
    println!("{:?}", duration);
    Ok(())
}

This takes 3.702652552s, but the following Python code

import numpy as np

import time

d = 256                           # dimension
nb = 100000                       # database size
nq = 10000                        # nb of queries
np.random.seed(1234)             # make reproducible
xb = np.random.random((nb, d)).astype('float32')
xb[:, 0] += np.arange(nb) / 1000.
xq = np.random.random((nq, d)).astype('float32')
xq[:, 0] += np.arange(nq) / 1000.

import faiss                   # make faiss available
index = faiss.IndexFlatL2(d)   # build the index
print(index.is_trained)
index.add(xb)                  # add vectors to the index
print(index.ntotal)

k = 10                         # we want to see 10 nearest neighbors
start = time.time()
D, I = index.search(xq, k)     # actual search
print(I[:5])                   # neighbors of the 5 first queries
print(I[-5:])                  # neighbors of the 5 last queries
duration = time.time() - start
print(duration * 1000)

takes 1.55.612159729004s. For the Python version, I just ran pip install faiss-cpu.

BerserkerMother avatar Aug 27 '24 09:08 BerserkerMother
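
The two scripts above are not measuring quite the same thing: the Rust program searches constant vectors with k = 5, while the Python script uses random data with k = 10 and includes two print calls in the timed region. A like-for-like comparison could be set up roughly as sketched below. This uses only the Rust standard library; `pseudo_random` and `median_time` are hypothetical helpers introduced for illustration, the generator is a simple LCG that does not reproduce NumPy's values, and the closure body is a placeholder for the real search call.

```rust
use std::time::{Duration, Instant};

/// Deterministic pseudo-random f32 values in [0, 1) from a 64-bit LCG.
/// Hypothetical helper: it stands in for NumPy's generator, it does not match it.
fn pseudo_random(len: usize, seed: u64) -> Vec<f32> {
    let mut state = seed;
    (0..len)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Keep the top 24 bits so the value is exactly representable as f32.
            (state >> 40) as f32 / (1u64 << 24) as f32
        })
        .collect()
}

/// Run `work` once as a warm-up, then `runs` timed times; return the median.
fn median_time<F: FnMut()>(runs: usize, mut work: F) -> Duration {
    assert!(runs > 0, "need at least one timed run");
    work();
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    // Sizes scaled down from the thread's 100000 x 256 and 10000 x 256.
    let (d, nb, nq) = (256, 1_000, 100);
    let database = pseudo_random(d * nb, 1234);
    let queries = pseudo_random(d * nq, 5678);

    // Placeholder workload. With the faiss crate, this closure would instead
    // call `index.search(&queries, k)` using the same `k` as the Python script.
    let mut checksum = 0.0f32;
    let elapsed = median_time(5, || {
        checksum = database
            .iter()
            .zip(queries.iter().cycle())
            .map(|(a, b)| a * b)
            .sum();
    });
    println!("median over 5 runs: {:?} (checksum {})", elapsed, checksum);
}
```

On the Python side, the equivalent change would be to take the end timestamp immediately after index.search returns, before anything is printed.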