hnswlib icon indicating copy to clipboard operation
hnswlib copied to clipboard

mutuallyConnectNewElement and updatePoint might cause isolated-island vertex

Open Arthur-Bi opened this issue 1 year ago • 0 comments

Hi, I'm modifying HierarchicalNSW so that everything is stored in a high-perf db. And I notice some possible bug.

            // If cur_c is already present in the neighboring connections of `selectedNeighbors[idx]` then no need to modify any connections or run the heuristics.
            if (!is_cur_c_present) {
                if (sz_link_list_other < Mcurmax) {
                    data[sz_link_list_other] = cur_c;
                    setListCount(ll_other, sz_link_list_other + 1);
                } else {
                    // finding the "weakest" element to replace it with the new one
                    dist_t d_max = fstdistfunc_(getDataByInternalId(cur_c), getDataByInternalId(selectedNeighbors[idx]),
                                                dist_func_param_);
                    // Heuristic:
                    std::priority_queue<std::pair<dist_t, tableint>, std::vector<std::pair<dist_t, tableint>>, CompareByFirst> candidates;
                    candidates.emplace(d_max, cur_c);

                    for (size_t j = 0; j < sz_link_list_other; j++) {
                        candidates.emplace(
                                fstdistfunc_(getDataByInternalId(data[j]), getDataByInternalId(selectedNeighbors[idx]),
                                                dist_func_param_), data[j]);
                    }

                    getNeighborsByHeuristic2(candidates, Mcurmax);

In mutuallyConnectNewElement, around the second getNeighborsByHeuristic2, we are adding cur_c to the neighbor list of selectedNeighbors(for convenience, we call this SELECTED_A ). By doing so, we might rm existing neighbor (for convenience, we call this VERTEX_B ) from SELECTED_A's neighborlist.

If VERTEX_B is only connected to SELECTED_A, this rm process will make VERTEX_B unavailable by entry-point.

updatePoint has the same problem.

Arthur-Bi avatar Nov 14 '23 13:11 Arthur-Bi