NearestNeighbors.jl icon indicating copy to clipboard operation
NearestNeighbors.jl copied to clipboard

Querying number of distance evaluations

Open fcdimitr opened this issue 2 years ago • 3 comments

Is there a way to determine the number of distance calculations performed during the kNN computation? The brute-force approach requires evaluating the distances between every query and corpus point, resulting in $n_q \times n_c$ calculations. However, methods like KDTree or BallTree can potentially reduce this number, especially in low-dimensional spaces.

Does the package provide a mechanism to retrieve the actual number of distance evaluations during the kNN computation?

Thank you!

fcdimitr avatar Jun 25 '23 20:06 fcdimitr

That's a good idea!

There used to be a "debug" mode that you could activate that added some stats but it was removed in https://github.com/KristofferC/NearestNeighbors.jl/pull/98/files#diff-3d112593fe19c53f85f19846cc9d0406c17b39736378b65e7d12821d20207302.

With more experienced eyes, I think a better alternative is to allow one to pass something like a

struct EvaluationData
    n_distance_evals
    n_tree_intersection_evals
    ...
end

to inrange and knn and have that struct be populated. By default that argument would just be nothing and unused.

KristofferC avatar Nov 29 '23 20:11 KristofferC

I made a minor modification on my fork to include a global counter:

https://github.com/KristofferC/NearestNeighbors.jl/compare/master...fcdimitr:NearestNeighbors.jl:master

It works correctly, and some preliminary benchmarking shows that it does not affect the running time.

In any case, I agree that a more structured approach to collecting statistics should be preferred.

fcdimitr avatar Nov 29 '23 20:11 fcdimitr

Another thing that can be used is GFlops.jl (requires the https://github.com/charleskawczynski/GFlops.jl/tree/ck/julia1.10 branch).

julia> tree = BallTree(rand(3,10^5));

julia> v = rand(3)
3-element Vector{Float64}:
 0.38248598898980024
 0.6378296367207168
 0.9255061078525778

julia> @count_ops knn(tree, v, 4)
Flop Counter: 9124 flop
┌──────┬─────────┐
│      │ Float64 │
├──────┼─────────┤
│  add │    2184 │
│  sub │    3300 │
│  mul │    2912 │
│ sqrt │     728 │
└──────┴─────────┘

KristofferC avatar Jun 29 '24 14:06 KristofferC