NearestNeighbors.jl icon indicating copy to clipboard operation
NearestNeighbors.jl copied to clipboard

Performant way to estimate distances for bulk `inrange` search?

Open Datseris opened this issue 5 years ago • 4 comments

Hi there, I'm doing standard bulk in-range searches via the syntax vec_of_idxs = NearestNeighbors.inrange(tree, queries, r). My data are SVector{D, Float64} (but I'm not sure whether this matters.

At the moment I need the distances of the found neighbors from the queries, even more than the indices. I've noticed that while the above call is spectacularly fast, the code I wrote to get the distances is veeeery slow:

function _NN_get_ds(tree::KDTree, query, idxs)
    if tree.reordered
        ds = [
            evaluate(tree.metric, query, tree.data[
                findfirst(isequal(i), tree.indices)
            ]) for i in idxs]
    else
        ds = [evaluate(tree.metric, query, tree.data[i]) for i in idxs]
    end
end

and now I transform my original code as

    vec_of_idxs = NearestNeighbors.inrange(tree, queries, r)
    vec_of_ds = [ _NN_get_ds(tree, queries[j], vec_of_idxs[j]) for j in 1:length(queries)]

I'm wonder, whether something already exists in this library, that provides these distances in a faster way?

Datseris avatar Dec 03 '20 19:12 Datseris

PS: After doing some testing, the bottleneck is in fact the weird clause I've written if tree.reordered. I obviously shouldn't be using findfirst in such an inner loop... What is the correct way to translate tree.indices to data indices when the tree is reordered?

Datseris avatar Dec 03 '20 20:12 Datseris

I guess you could create the inverse lookup (for all points) a single time and then reuse that.

KristofferC avatar Dec 03 '20 20:12 KristofferC

I guess that makes sense. But how does the tree itself know the correct indices? I mean, for the "skip" predicate version of e.g. knn, the tree needs to somehow know the correct indices as well, no? Can't I obtain them the same way the tree obtains them when it checks to skip or not?

Datseris avatar Dec 03 '20 20:12 Datseris

It just does this https://github.com/KristofferC/NearestNeighbors.jl/blob/2efd998b6bb74d175e6cbbe1c8bc8e775e502f0b/src/tree_ops.jl#L97

KristofferC avatar Dec 03 '20 21:12 KristofferC