Dictionaries.jl icon indicating copy to clipboard operation
Dictionaries.jl copied to clipboard

Faster intersection of Indices

Open pevnak opened this issue 2 years ago • 7 comments

I think that the intersection of Indices can be made trivially faster by ordering of arguments.

 a = distinct(rand(Int, 10000))
 b = distinct(rand(Int, 100))
julia> @btime intersect(a, b)
  134.240 μs (8 allocations: 284.55 KiB)
0-element Indices{Int64}

julia> @btime intersect(b, a)
  1.238 μs (5 allocations: 3.94 KiB)
0-element Indices{Int64}

julia> @btime length(a) > length(b) ? intersect(b, a) : intersect(a, b)
  1.283 μs (6 allocations: 3.95 KiB)
0-element Indices{Int64}

Would it make sense?

pevnak avatar Oct 06 '21 19:10 pevnak

This did the job for me

function Base.intersect(s1::AbstractIndices, s2::AbstractIndices)
    length(s1) > length(s2) ? filter!(in(s2), copy(s1)) : filter!(in(s1), copy(s2))
end

pevnak avatar Oct 06 '21 19:10 pevnak

Thanks @pevnak

One issue I had was trying to get the order of elements to come out the same either way. I don’t know if that is important, but so far I’ve been conservative with ordering issues.

Not sure what other people’s opinions are?

andyferris avatar Oct 06 '21 21:10 andyferris

I cannot judge that. I am usually concerned with speed.

pevnak avatar Oct 07 '21 03:10 pevnak

Since there is a stable_sort and sort, would it make sense to have something similar? I apologize for abusing the comments, but the difference in speed I am getting in my application is enormous.

I guess it might be possible to implement filtering for the opposite case, but it might be awful. One would first invalidate everything and then turn on only items that are in the other set. In this way, I guess, the iteration might go over the other set. I guess you got my idea, but not sure if it something that should be done.

pevnak avatar Oct 07 '21 04:10 pevnak

Yes perhaps.

I apologize for abusing the comments, but the difference in speed I am getting in my application is enormous.

There's no abuse here :) I, too, take performance seriously - we'll get this sorted.

One consideration might be, when length(s1) > length(s2), to record the tokens in s1, and sort them and give that back.

andyferris avatar Oct 07 '21 04:10 andyferris

Another complication is the fact that either collection might be SizeUnknown (i.e. a FilteredIndices) so even length can be expensive.

andyferris avatar Oct 07 '21 04:10 andyferris

But it can be narrowed for types where this will be performant.

pevnak avatar Oct 07 '21 05:10 pevnak