
RFC: `stack(::Slices)` is `permutedims`

Open mcabbott opened this issue 3 years ago • 0 comments

Some operations on `Slices` (from https://github.com/JuliaLang/julia/pull/32310) can be done more efficiently on the parent array. Should we add shortcuts, and how widely? Reductions are the obvious candidate, but the problem is this:

julia> x = rand(1:99, 2,3);

julia> sum(eachcol(x)) == vec(sum(x, dims=2))  # adds whole arrays, inefficient
true

julia> prod(eachcol(x))
ERROR: MethodError: no method matching *(::SubArray, ::SubArray)

julia> identity.(0 .+ prod.(eachrow(x))) == vec(prod(x, dims=2))  # should this un-fuse?
true

and (edit) another problem is this:

julia> using CUDA

julia> prod(cu(x); dims=2) isa CuArray  # whole-array operation

julia> prod.(eachcol(cu(x))) isa Array  # iterating on CPU
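
For reference, the parent-array equivalents of the slice reductions above can be written out as a small sketch. The name `colprod` is made up for this illustration, and nothing here is code from the PR; it only restates the identities from the REPL sessions on a plain `Array`.

```julia
x = rand(1:99, 2, 3)

# Summing the columns adds whole vectors; the same numbers come from
# a single reduction over dims=2 of the parent array.
@assert sum(eachcol(x)) == vec(sum(x, dims=2))

# prod(eachcol(x)) throws a MethodError because * is not defined for two
# vectors, but an explicitly broadcast reduction gives the elementwise product.
colprod = reduce((a, b) -> a .* b, eachcol(x))
@assert colprod == vec(prod(x, dims=2))

# Broadcasting prod over the rows also matches the dims=2 reduction.
@assert prod.(eachrow(x)) == vec(prod(x, dims=2))
```

The point is only that each slice-wise form has a whole-array counterpart; whether Base should rewrite one into the other is the open question.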

This PR tries something which seems safer: sending `stack(eachslice(x))` to `permutedims`. This is not always much of a speed improvement, but sometimes it is:

julia> let n = 500
           x = randn(n,n,n)
           a = @btime stack(eachslice($x, dims=2), dims=1)
           b = @btime permutedims($x, (2,1,3))
           a == b
       end
  945.207 ms (2 allocations: 953.67 MiB)
  203.902 ms (2 allocations: 953.67 MiB)
true
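
The equivalence being exploited can be checked on a small array for every combination of slicing and stacking dimension. This is a minimal sketch assuming Julia 1.9 or later (where `stack` exists); `slice_perm` is a hypothetical helper written for this illustration, not a function from Base or from this PR.

```julia
# Hypothetical helper: the permutation p such that
#   stack(eachslice(x, dims=d), dims=e) == permutedims(x, p)
# for an N-dimensional x. The slices keep the remaining dimensions in
# order, and stack inserts the sliced dimension d at position e.
function slice_perm(N::Int, d::Int, e::Int)
    rest = [i for i in 1:N if i != d]  # dimensions that survive inside each slice
    insert!(rest, e, d)                # the slice index lands at position e
    return Tuple(rest)
end

x = collect(reshape(1:24, 2, 3, 4))

# Check all 9 combinations of d and e for a 3-dimensional array.
for d in 1:3, e in 1:3
    @assert stack(eachslice(x, dims=d), dims=e) == permutedims(x, slice_perm(3, d, e))
end

# The case benchmarked above: dims=2 slices stacked along dims=1.
@assert slice_perm(3, 2, 1) == (2, 1, 3)
```

When `d == e` the permutation is the identity, so in that case `stack` simply reproduces a copy of `x`.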

Draft, needs tests, and possibly located in the wrong file.

It possibly introduces type instabilities in its present form. It would be nicer if this could dispatch on the type of `Slices` to know whether or not it can use `permutedims`, but as far as I can see it cannot.
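
One way to see why the type alone may not be enough, offered as an illustration rather than as the exact case the PR has in mind: slicing over `dims=(1,3)` and over `dims=(3,1)` appears to produce `Slices` objects of the same type, while the permutation needed to reproduce `stack` differs. The sketch assumes Julia 1.9 or later, and the variable names are made up for the example.

```julia
x = collect(reshape(1:24, 2, 3, 4))

s13 = eachslice(x, dims=(1, 3))  # 2×4 array of length-3 slices
s31 = eachslice(x, dims=(3, 1))  # 4×2 array of length-3 slices

# The order of the sliced dimensions is stored as a value inside the
# object, not in its type parameters, so the two types coincide.
same_type = typeof(s13) == typeof(s31)

# Yet each one corresponds to a different permutation of the parent.
@assert stack(s13) == permutedims(x, (2, 1, 3))
@assert stack(s31) == permutedims(x, (2, 3, 1))
```

If `same_type` is `true`, a method chosen purely by dispatch cannot tell these two cases apart, so the permutation has to be computed from runtime values.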

mcabbott, Nov 13 '22 22:11