Leonid Boytsov
PS: to reiterate/clarify, the dot-product-to-L2 transformation preserves the relationship between query and data points, but not between non-query data points. I hope this makes sense.
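To illustrate the point above, here is a minimal sketch. It assumes the common augmentation-based reduction, which may not be the exact variant discussed in the thread: each data point gets one extra coordinate sqrt(M^2 - ||x||^2), where M is the largest data norm, and the query gets an extra 0. The helper names (`to_l2_data`, `to_l2_query`) and the random data are hypothetical and are not part of NMSLIB or HNSWLIB.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def to_l2_data(points):
    # Assumed transform: append sqrt(M^2 - ||x||^2), M = largest data norm.
    max_sq = max(dot(p, p) for p in points)
    return [p + [math.sqrt(max(0.0, max_sq - dot(p, p)))] for p in points], max_sq

def to_l2_query(q):
    # The query is padded with a zero, so its extra coordinate never interacts.
    return q + [0.0]

random.seed(0)
data = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(8)]

data_l2, max_sq = to_l2_data(data)
query_l2 = to_l2_query(query)

# Query vs. data: ||q' - x'||^2 = ||q||^2 + M^2 - 2 * (q . x),
# so ranking by smallest L2 distance equals ranking by largest dot product.
by_dot = sorted(range(len(data)), key=lambda i: -dot(query, data[i]))
by_l2 = sorted(range(len(data)), key=lambda i: sq_l2(query_l2, data_l2[i]))
same_for_query = by_dot == by_l2

# Data vs. data: both points carry a non-zero extra coordinate, so the
# transformed distance mixes in an extra term and the rankings need not agree.
others = range(1, len(data))
dd_by_dot = sorted(others, key=lambda i: -dot(data[0], data[i]))
dd_by_l2 = sorted(others, key=lambda i: sq_l2(data_l2[0], data_l2[i]))
same_for_data = dd_by_dot == dd_by_l2

print("query-to-data ranking preserved:", same_for_query)
print("data-to-data ranking preserved:", same_for_data)
```

The second comparison matters for graph-based indices: neighbors are chosen during construction using data-to-data distances, and under this transformation those distances no longer reflect the dot product.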
What exactly is being sorted here?
@fcjy do you have the same issue with the NMSLIB implementation? PS: HNSW doesn't always work. I have an example where it doesn't work with the cosine similarity and with the...
@fcjy could you share your data with me too? (leo at boytsov.info). A sample of one to 10 million will be enough. Many thanks!
@wuwenjunwwj the inner product search is likely more challenging. It's not the same search problem. Plus, for topK=5000 it's difficult to get accurate results with k-NN search overall.
@fcjy do you still have that old data set? I tried to download it recently, but clearly it's not there any more :-)
@alpinejoe and @yurymalkov could either of you point me to the exact place where the buffer overrun is happening?
@suimo I believe HNSWLIB should have similar performance to HNSW in NMSLIB in terms of speed, but indexing is faster. However, I am getting 0.01ms at 90% recall. Also NMSLIB...
@yurymalkov if I remember correctly, you tested on random 4d data, and the numbers were pretty close. Actual 3d data has lower intrinsic dimensionality, which gives tree-based methods an edge.
@suimo if you are using OpenMP, it's not a fair comparison, of course. It is hard nowadays to find a server that has fewer than 8 actual cores! If...