Leonid Boytsov comments

Results: 250 comments of Leonid Boytsov

Applicability of HNSW for maximum inner product search

@OriKatz I actually ran my own experiments (on very different data) with the same reduction, and it didn't work out well for me either. Bugs are always possible, though.

Applicability of HNSW for maximum inner product search

@yurymalkov what if you have a lot of such points:

Applicability of HNSW for maximum inner product search

@OriKatz I would think that graph-based retrieval is a very stable approach in general. It often works very well on weird datasets. However, for the popular-element problem there might be...

Applicability of HNSW for maximum inner product search

@yurymalkov I don't suggest placing popular elements in top layers :-)

Applicability of HNSW for maximum inner product search

@OriKatz I've had some success building a graph using a slightly different metric than the original one. I used it in my thesis and in a follow-up publication. If the indexing...

Applicability of HNSW for maximum inner product search

PS: and of course there's always the option to index popular items separately in a second index.

Applicability of HNSW for maximum inner product search

@yurymalkov ML fairness people would eat you alive 😃

multi-node parallel hnsw

Hi @jianshu93, it's not clear what you mean by distributed computation. In the most common scenario, which is called sharding, the database is split into K chunks that are queried...

multi-node parallel hnsw

@jianshu93 if the search is **perfect**, then getting top-k results from each of the K shards **provably** retrieves the top-k of the complete collection. The reason is simple: imagine some number k1
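
To make the sharding claim concrete, here is a minimal sketch in plain Python. It is not hnswlib or nmslib code: the data, the shard layout, and the helper names `shard_top_k` and `merged_top_k` are all hypothetical, and the per-shard search is an exact brute-force scan, which is the "perfect search" assumption in the comment. With an approximate per-shard search the equality at the end may not hold.

```python
import heapq
import random


def shard_top_k(shard, k):
    """Exact top-k of one shard: the k (distance, id) pairs with the smallest distance."""
    return heapq.nsmallest(k, shard)


def merged_top_k(shards, k):
    """Collect the exact top-k from every shard, then keep the k best overall."""
    candidates = [hit for shard in shards for hit in shard_top_k(shard, k)]
    return heapq.nsmallest(k, candidates)


random.seed(0)

# Hypothetical collection: 1000 items, each stored as (distance to the query, item id).
collection = [(random.random(), item_id) for item_id in range(1000)]

K = 4   # number of shards (assumption for this sketch)
k = 10  # number of neighbors requested

# Round-robin split of the collection into K shards.
shards = [collection[i::K] for i in range(K)]

global_result = heapq.nsmallest(k, collection)
sharded_result = merged_top_k(shards, k)

# Every true top-k item is also in the top-k of its own shard, so the merge loses nothing.
print(sharded_result == global_result)
```

The merge step only ever looks at K * k candidates, so its cost is small compared with the per-shard searches.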

L2 and dot product space are inconsistent

@h-shahidi you are asking about nmslib, not hnswlib. First of all, using random embeddings of dim 100 for benchmarking is a very bad idea, because they won't be searched very...