hnswlib

Correlation of `M`, `k` and `number of items` in the index

Open alexzaf7 opened this issue 9 months ago • 1 comments

Hello,

I am creating various indexes with the following options:

  • space: l2
  • vector dimension: 256
  • M: 16
  • ef_construction: 200
  • ef: 200

The only thing that varies between indexes is the number of items, mostly 100 < max_elements < 20000. Note that all items are added at once, with no later additions or resizing.

The issue I face is that when I search an index with fewer items (~1000), I get the error "Cannot return the results in a contigious 2D array. Probably ef or M is too small". I set k to min(max_elements, 2000), so I never request more items than the index contains.
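To make the setup concrete, here is a minimal sketch of how such an index might be built and queried with the hnswlib Python bindings, using the options listed above. The helper names `clamp_k`, `build_index` and `try_knn` are made up for illustration and are not part of hnswlib. The sketch assumes the vectors are a 2D float32 NumPy array and that the error reaches Python as a `RuntimeError`.

```python
def clamp_k(max_elements, cap=2000):
    """Hypothetical helper: k = min(max_elements, 2000), as described above."""
    return min(max_elements, cap)


def build_index(vectors, M=16, ef_construction=200, ef=200):
    """Hypothetical helper: build an l2 index, adding all items in one call.

    `vectors` is assumed to be a 2D float32 NumPy array (num_items x dim).
    """
    # Third-party dependency, imported lazily so the other helpers
    # can be used without hnswlib installed.
    import hnswlib

    num_items, dim = vectors.shape
    index = hnswlib.Index(space="l2", dim=dim)
    index.init_index(max_elements=num_items,
                     ef_construction=ef_construction, M=M)
    index.add_items(vectors)  # no later additions or resizing
    index.set_ef(ef)          # query-time ef
    return index


def try_knn(index, queries, k):
    """Hypothetical helper: run knn_query and report a failure instead of raising.

    Returns (labels, distances) on success, or None if the query raised
    a RuntimeError (assumed to be how the reported error surfaces).
    """
    try:
        return index.knn_query(queries, k=k)
    except RuntimeError as err:
        print(f"knn_query failed for k={k}: {err}")
        return None
```

Calling `build_index` on a 1000 x 256 array and then `try_knn(index, queries, clamp_k(1000))` mirrors the failing case. Whether the error actually appears will likely depend on the data, so this is only a starting point for reproducing it.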

The same knn query on an index with more items (~5000) works fine.

After some experimentation, setting M=50 seems to work: the error no longer occurs, regardless of the index size.

Why this fixed the issue is still unclear to me, which brings me back to my initial question: is there any correlation between M, k and the number of items, or any other relationship between these parameters that I should be aware of?

Thanks

alexzaf7, May 24 '24 13:05