RFC: `stack(::Slices)` is `permutedims`
Some operations on `Slices` (from https://github.com/JuliaLang/julia/pull/32310) can be done more efficiently on the parent array. Should we add shortcuts, and how widely? The obvious candidates are reductions, but the problem is this:
julia> x = rand(1:99, 2,3);
julia> sum(eachcol(x)) == vec(sum(x, dims=2)) # adds whole arrays, inefficient
true
julia> prod(eachcol(x))
ERROR: MethodError: no method matching *(::SubArray, ::SubArray)
julia> identity.(0 .+ prod.(eachrow(x))) == vec(prod(x, dims=2)) # should this un-fuse?
true
and (edit) another problem is this:
julia> using CUDA
julia> prod(cu(x); dims=2) isa CuArray # whole-array operation
julia> prod.(eachcol(cu(x))) isa Array # iterating on CPU
This PR tries something that seems safer: sending `stack(eachslice(x))` to `permutedims`. The speed-up is not always large, but sometimes it is:
julia> using BenchmarkTools
julia> let n = 500
           x = randn(n,n,n)
           a = @btime stack(eachslice($x, dims=2), dims=1)
           b = @btime permutedims($x, (2,1,3))
           a == b
       end
945.207 ms (2 allocations: 953.67 MiB)
203.902 ms (2 allocations: 953.67 MiB)
true
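The index bookkeeping behind that equivalence is easier to see on a small array. This is a sketch assuming Julia 1.9 or later, where `stack` is in Base. The array shape and variable names are only for illustration.

```julia
# Requires Julia 1.9 or later, where `stack` is in Base.
x = reshape(collect(1:24), 2, 3, 4)

# Slicing along dimension 2 and stacking the slices along dimension 1
# moves the old dimension 2 to the front: a[j, i, k] == x[i, j, k].
a = stack(eachslice(x, dims=2), dims=1)
b = permutedims(x, (2, 1, 3))

# Stacking along a new last dimension (the default) instead moves
# dimension 2 to the end: c[i, k, j] == x[i, j, k].
c = stack(eachslice(x, dims=2))
d = permutedims(x, (1, 3, 2))

println(a == b)
println(c == d)
println(size(a), " ", size(c))
```

In both cases the slices only reorder the dimensions of the parent, so the result is a single `permutedims` call with the appropriate permutation.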
Draft, needs tests, and possibly located in the wrong file.
Possibly introduces type instabilities in its present form. It would be nicer if this could dispatch on the type of `Slices` to decide whether it can use `permutedims`, but as far as I can see it cannot.