
has_deletions == false

Open lockeliu opened this issue 1 year ago • 5 comments

https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L267

“ || has_deletions == false”

There is a problem with this condition: it causes the loop to exit early.

lockeliu avatar Sep 28 '22 08:09 lockeliu

I think this is done intentionally. See https://github.com/nmslib/hnswlib/pull/344

When has_deletions = true, the index contains deleted elements and recall is worse than when there are none. To improve accuracy in that case, the loop exits only if ((-current_node_pair.first) > lowerBound && (top_candidates.size() == ef)), so the search keeps adding elements to the results until it can return ef elements.
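The check described above can be sketched as a standalone predicate. This is a minimal illustration, not the hnswlib source: the function name `should_stop` and its parameter names are hypothetical, `candidate_dist` stands in for `-current_node_pair.first`, and `num_results` stands in for `top_candidates.size()`.

```cpp
#include <cstddef>

// Simplified stand-in for the loop-exit check discussed in this thread.
// Names are illustrative assumptions; this is not the hnswlib code itself.
inline bool should_stop(float candidate_dist, float lower_bound,
                        std::size_t num_results, std::size_t ef,
                        bool has_deletions) {
    // The closest unexplored candidate is farther than the worst kept result.
    const bool candidate_is_worse = candidate_dist > lower_bound;
    // With deletions, stopping is allowed only once ef results are collected;
    // without deletions, this part of the check is always satisfied.
    const bool enough_results = (num_results == ef) || !has_deletions;
    return candidate_is_worse && enough_results;
}
```

With `has_deletions == true` and fewer than `ef` results, the predicate returns false even when the candidate is farther than `lower_bound`, so the search continues. With `has_deletions == false`, the distance comparison alone decides.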

dyashuni avatar Sep 28 '22 12:09 dyashuni

I see. But when has_deletions = false, how do we ensure that there are enough results? If we remove “ || has_deletions == false”, we get more results.

lockeliu avatar Sep 28 '22 13:09 lockeliu

The condition "top_candidates.size() == ef" is essentially there to solve the problem of too few results.

lockeliu avatar Sep 28 '22 13:09 lockeliu

As I understand it, when has_deletions = false we cannot output fewer than ef elements (the only exception is when the total number of elements in the index is less than ef).

dyashuni avatar Sep 28 '22 13:09 dyashuni

I see, thanks.

lockeliu avatar Sep 28 '22 13:09 lockeliu