NearestNeighbors.jl
NearestNeighbors.jl copied to clipboard
use ArrayOfArrays for return value to reduce the number of allocated arrays
A long standing gripe for me has been that indices and distances are returned as standard nested Array
s. Typically, each inner array hold quite a small number of neighbors so it means that we allocate a large number of small arrays.
Using ArrayOfArrays, these are stored contigously in one large flat array instead.
The difference in allocations can be readily seen:
julia> input = rand(3, 10^6);
julia> tree = KDTree(rand(3, 10^6));
julia> @time knn(tree, input, 5);
1.538003 seconds (2.00 M allocations: 221.253 MiB, 10.03% gc time)
julia> @time knn(tree, input, 5);
1.489310 seconds (98 allocations: 189.884 MiB, 0.29% gc time)
One potential issue with this is that getindex
is slower on AoA so depending on how people use the return value they might encounter performance regressions.
@nanosoldier runtests()
Update on PkgEvalJob KristofferC/NearestNeighbors.jl@048c5a7 vs. KristofferC/NearestNeighbors.jl@bc645d1: Accepted
Update on PkgEvalJob KristofferC/NearestNeighbors.jl@048c5a7 vs. KristofferC/NearestNeighbors.jl@bc645d1: Running
The package evaluation job you requested has completed - possible new issues were detected. The full report is available.