InvertedIndices.jl
InvertedIndices.jl copied to clipboard
Bad performance caused by type unstable `iterate`
I tried to compare the performance of indexing :
function benchindexing(a, not)
print("indexing by `!in(not)`: ")
@btime $a[map(!in($not), axes($a, 1))]
print("indexing by `!in(Set(not))`:")
@btime $a[map(!in(Set($not)), axes($a, 1))]
print("indexing by `Not(not)`: ")
@btime $a[Not($not)]
return nothing
end
For most of case, Not(not)
is much slower than map(!in(not), axes(a, 1))
:
julia> benchindexing(rand(100), 1)
indexing by `!in(not)`: 224.546 ns (2 allocations: 1.06 KiB)
indexing by `!in(Set(not))`: 679.196 ns (6 allocations: 1.53 KiB)
indexing by `Not(not)`: 13.666 μs (497 allocations: 16.34 KiB)
julia> benchindexing(rand(100), 1:10)
indexing by `!in(not)`: 229.121 ns (2 allocations: 1008 bytes)
indexing by `!in(Set(not))`: 722.948 ns (6 allocations: 1.45 KiB)
indexing by `Not(not)`: 12.292 μs (452 allocations: 14.89 KiB)
julia> benchindexing(rand(100), collect(1:10))
indexing by `!in(not)`: 598.545 ns (2 allocations: 1008 bytes)
indexing by `!in(Set(not))`: 712.652 ns (6 allocations: 1.45 KiB)
indexing by `Not(not)`: 12.500 μs (452 allocations: 14.89 KiB)
julia> benchindexing(rand(100), collect(1:50))
indexing by `!in(not)`: 1.483 μs (2 allocations: 688 bytes)
indexing by `!in(Set(not))`: 1.087 μs (9 allocations: 2.35 KiB)
indexing by `Not(not)`: 7.083 μs (252 allocations: 8.33 KiB)
julia> benchindexing(rand(100), collect(1:100))
indexing by `!in(not)`: 2.176 μs (2 allocations: 272 bytes)
indexing by `!in(Set(not))`: 1.454 μs (9 allocations: 3.07 KiB)
indexing by `Not(not)`: 285.223 ns (5 allocations: 224 bytes)
julia> benchindexing(rand(10000), collect(1:5000))
indexing by `!in(not)`: 11.821 ms (3 allocations: 49.08 KiB)
indexing by `!in(Set(not))`: 200.125 μs (10 allocations: 121.74 KiB)
indexing by `Not(not)`: 735.792 μs (35003 allocations: 976.67 KiB)
julia> benchindexing(rand(10000), collect(1:8000))
indexing by `!in(not)`: 15.105 ms (2 allocations: 25.69 KiB)
indexing by `!in(Set(not))`: 181.125 μs (9 allocations: 170.35 KiB)
indexing by `Not(not)`: 309.500 μs (14002 allocations: 390.78 KiB)
julia> benchindexing(rand(10000), collect(1:10000))
indexing by `!in(not)`: 15.769 ms (2 allocations: 10.02 KiB)
indexing by `!in(Set(not))`: 218.042 μs (9 allocations: 154.68 KiB)
indexing by `Not(not)`: 16.250 μs (5 allocations: 224 bytes)
There are a much more allocations, which are caused by collect(::Base.Generator{InvertedIndices.InvertedIndexIterator{...})
:
julia> @btime collect(itr) setup=(itr=(i for i in to_indices(rand(100), (Not(1:50),))[1]));
6.990 μs (253 allocations: 8.36 KiB)
The code_warntype
shows:
julia> code_warntype(collect, (Base.Generator{InvertedIndices.InvertedIndexIterator{Int64, UnitRange{Int64}, Base.OneTo{Int64}}, typeof(identity)},))
Variables
#self#::Core.Const(collect)
itr::Base.Generator{InvertedIndices.InvertedIndexIterator{Int64, UnitRange{Int64}, Base.OneTo{Int64}}, typeof(identity)}
@_3::Int64
arr::Vector{Int64}
st::Tuple{Union{Nothing, Tuple{Int64, Int64}}, Union{Nothing, Tuple{Int64, Int64}}}
v1::Int64
y::Union{Nothing, Tuple{Int64, Tuple{Union{Nothing, Tuple{Int64, Int64}}, Union{Nothing, Tuple{Int64, Int64}}}}}
shape::Tuple{Base.OneTo{Int64}}
et::Type{Int64}
isz::Base.HasShape{1}
@_11::Type{Int64}
@_12::Tuple{Base.OneTo{Int64}}
...
where y = iterate(itr)
and st = y[2]
are type unstable.
IMO, the type instability of iterate(::InvertedIndexIterator)
causes the bad performance. Is there any way to fix it?
Besides, I suggest that for Not{S}
where S<:Integer
or S <: AbstractRange
, to_indices
should convert it to a Base.LogicalIndex
by Base.LogicalIndex(map(!in(not), ind)
, because its is really faster:
julia> benchindexing(rand(10000), 1:10000)
indexing by `!in(not)`: 5.972 μs (2 allocations: 10.02 KiB)
indexing by `!in(Set(not))`: 215.750 μs (9 allocations: 154.68 KiB)
indexing by `Not(not)`: 14.625 μs (5 allocations: 224 bytes)
julia> benchindexing(rand(10000), 1:2:10000)
indexing by `!in(not)`: 69.958 μs (3 allocations: 49.08 KiB)
indexing by `!in(Set(not))`: 200.250 μs (10 allocations: 121.74 KiB)
indexing by `Not(not)`: 794.416 μs (39492 allocations: 1.40 MiB)