MappedArrays.jl
MappedArrays.jl copied to clipboard
`iterate` fallback causes 5X performance drop
Consider a "function" that has two versions. foo3
is just foo2
, but it accepts a function in the first argument the same way that e.g. sum
does.
function foo2(a)
r = nothing
for y in a
if r === nothing
r = y
else
r += y
end
end
return r
end
function foo3(f, a)
r = nothing
for x in a
y = f(x)
if r === nothing
r = y
else
r += y
end
end
return r
end
I personally do not like the pattern, and MappedArrays.jl
gives me a way of factoring out that function so that I can only write foo2
, even if I, say, need to do the computation only on a specific field of each element.
using MappedArrays: mappedarray
const A = [(field=i,) for i = 1:1000]
const f = Base.Fix2(getfield, :field)
foo2(mappedarray(f, A)) == foo3(f, A) || error()
But the performance is not equivalent:
julia> using BenchmarkTools
julia> @benchmark foo2(mappedarray(f, A))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 325.878 ns (0.00% GC)
median time: 409.595 ns (0.00% GC)
mean time: 430.838 ns (0.00% GC)
maximum time: 1.315 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 221
julia> @benchmark foo3(f, A)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 63.348 ns (0.00% GC)
median time: 68.360 ns (0.00% GC)
mean time: 71.576 ns (0.00% GC)
maximum time: 266.343 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 972
The fallback on iterate(::AbstractArray, ...)
appears to be responsible. Defining
import MappedArrays
function Base.iterate(A::MappedArrays.ReadonlyMappedArray, args...)
r = iterate(A.data, args...)
r === nothing && return r
return (A.f(r[1]), r[2])
end
eliminates the gap!
Defining iterate
on MappedArray
is straightforward, too. I'm not sure what to do for the "Multi" types.
(originally from https://github.com/goretkin/FactorPerElementFunctionPerformance and https://discourse.julialang.org/t/factor-out-field-access-of-abstractarray-and-or-iterator-interface/41493)
Might be due to #46, where many trivial operations become non-trivial in the fallback solutions.
Mind to submit a PR?
Sorry, I was not clear in the previous comment; the solution you proposed seems to be a good patch already for this specific issue, PR is welcomed for this.
The root cause might be #46 but that's pretty much difficult to find a generic solution, as you have noted. I don't have a good solution to that, neither.
Any suggestions on what to do with the "Multi" types?
If I understand it correctly, the iterate protocol returns (val, state)
where state
can optionally be a tuple. I think we can iterate on the indices, and bookmarked it in the state = (A, indices_state)
.