hnswlib

C++ with multiple threads is much slower than Python

Open bluemandora opened this issue 2 years ago • 3 comments

I use ParallelFor as defined in the Python bindings: https://github.com/nmslib/hnswlib/blob/7cc0ecbd43723418f43b8e73a46debbbc3940346/python_bindings/bindings.cpp#L239
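
For readers who do not have the link open, a helper with the same calling convention can be sketched roughly as follows. This is a simplified illustration, not the verbatim code from bindings.cpp: the name `parallel_for_sketch` is hypothetical, and exception forwarding from worker threads is left out on purpose.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not the verbatim hnswlib code): calls fn(index, threadId)
// once for every index in [start, end), spread over numThreads worker threads.
template <class Function>
void parallel_for_sketch(std::size_t start, std::size_t end,
                         std::size_t numThreads, Function fn) {
    if (numThreads == 0) {
        // hardware_concurrency() may itself return 0 when it cannot be determined.
        numThreads = std::thread::hardware_concurrency();
        if (numThreads == 0) numThreads = 1;
    }
    if (numThreads == 1) {
        // Single-threaded path: no thread is spawned at all.
        for (std::size_t i = start; i < end; ++i) fn(i, 0);
        return;
    }
    // Workers pull the next unprocessed index from a shared atomic counter,
    // so the work is balanced dynamically rather than split into fixed ranges.
    std::atomic<std::size_t> next(start);
    std::vector<std::thread> workers;
    workers.reserve(numThreads);
    for (std::size_t threadId = 0; threadId < numThreads; ++threadId) {
        workers.emplace_back([&, threadId] {
            for (;;) {
                std::size_t i = next.fetch_add(1);
                if (i >= end) break;
                fn(i, threadId);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

A call such as `parallel_for_sketch(0, n, num_threads, [&](size_t row, size_t threadId) { ... })` has the same shape as the `ParallelFor` call in the snippet below.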

// C++ code: add 10000 points; addPoint takes 7580 ms in total
// compile flags: -std=c++11 -g -pipe -W -Wall -fPIC -pthread -Ofast -fwrapv
int d = 256;
hnswlib::labeltype n = 10000;

std::vector<float> data(n * d);
std::mt19937 rng;
rng.seed(47);
std::uniform_real_distribution<> distrib;

for (hnswlib::labeltype i = 0; i < n * d; ++i) {
    data[i] = distrib(rng);
}

hnswlib::L2Space space(d);
hnswlib::AlgorithmInterface<float>* alg_hnsw = new hnswlib::HierarchicalNSW<float>(&space, 2 * n);
int num_threads = std::thread::hardware_concurrency();
ParallelFor(0, n, num_threads, [&](size_t row, size_t threadId) {
    // Offset the float pointer directly: arithmetic on a (void *) cast is a
    // GNU extension that advances by bytes, not by floats.
    alg_hnsw->addPoint(data.data() + d * row, (size_t) row);
});

# Python code: add 10000 points; add_items takes 310 ms
import time

import hnswlib
import numpy as np

dim = 256
num_elements = 10000
data = np.float32(np.random.random((num_elements, dim)))
ids = np.arange(num_elements)
p = hnswlib.Index(space='ip', dim=dim)
p.init_index(max_elements=num_elements, ef_construction=200, M=16)

# Time only the insertion step.
start = time.time()
p.add_items(data, ids)
end = time.time()
print(f"add_items took {(end - start) * 1000:.0f} ms")

Is there anything I missed in the C++ code? Why is the C++ version so much slower?

bluemandora · Oct 18 '22 11:10