hnswlib
hnswlib copied to clipboard
valgrind problems when searching a reloaded index
I recently attempted to update the R bindings for hnswlib but there is a valgrind problem of the form:
==217201== Conditional jump or move depends on uninitialised value(s)
==217201== at 0x188D9AB5: hnswlib::HierarchicalNSW<float>::searchKnn(void const*, unsigned long) const (packages/tests-vg/RcppHNSW/src/../inst/include/hnswalg.h:1150)
This should also be triggerable via the second Python bindings example on the current README (the code after "An example with updates after serialization/deserialization").
The problem line: https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L1150
I believe the issue is that not all HierarchicalNSW
constructors are initializing all of their data members. Specifically num_deleted_
is a problem: while it can be incremented in loadIndex
under circumstances where isMarkDeleted
returns true, it's never actually assigned. In the example above, there isn't any deletion so the problem doesn't become apparent until searchKnn
is invoked.
I haven't checked for the validity of other class member variables so other problems may be lurking. I can certainly help with providing a PR or helping with testing, but I haven't updated the R bindings for a couple of years so more recent functionality isn't tested (yet). So this will for sure need some extra eyes on it.
Hi @jlmelville,
Yeah, it does not seem right. A PR would be awesome, I can review it. Please let me know if there are other options for me to help.
I am making a start on this, but leaving some notes here in case I fall victim to the usual combination of too many good intentions and too little free time.
I used clang-tidy
to hunt down any issues around non-initialized fields in constructors (ignoring all the modernizing/readability/style checks). It found these issues:
bruteforce.h
https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/bruteforce.h#L11 and https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/bruteforce.h#L14
constructor does not initialize these fields: data_, maxelements_, cur_element_count, size_per_element_, data_size_, dist_func_param_
hnswalg.h
https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L20 and https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L23
constructor does not initialize these fields: max_elements_, cur_element_count, size_data_per_element_, size_links_per_element_, num_deleted_, M_, maxM_, maxM0_, ef_construction_, mult_, revSize_, maxlevel_, visited_list_pool_, enterpoint_node_, size_links_level0_, offsetData_, offsetLevel0_, data_level0_memory_, linkLists_, data_size_, label_offset_, dist_func_param_, metric_distance_computations, metric_hops, ef_
https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L27
constructor does not initialize these fields: metric_distance_computations, metric_hops
Odd but unrelated things clang-tidy noted
These are not related to the constructor issue so I wouldn't put them in a PR for this issue, but may be worth looking at @yurymalkov:
In https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/hnswalg.h#L387:
parameter 'data_point' is unused
In https://github.com/nmslib/hnswlib/blob/443d667478fddf1e13f2e06b1da4e1ec3a9fe716/hnswlib/bruteforce.h#L26
suspicious exception object created but not thrown; did you mean 'throw runtime_error'?
Thanks for pointing out the problems! I'll fix them after merging #396