Tullio.jl
Better handling of slices f(A[i,:]) etc.
Somewhat unexpectedly, some mapslices-like things already work:
julia> f(scalar, row, num) = sum(row .+ num)/scalar;
julia> s = 2; m = rand(Int8, 4,4); a = rand(1:9, 4);
julia> @tullio z[_,r] := f(s, m[r,:], a[r])
1×4 Array{Float64,2}:
-25.5 -6.0 9.0 182.0
julia> @tullio z[_,r] := f(s, m[r,:], a[r]) threads=false verbose=true
[ Info: running KernelAbstractions CPU actor
1×4 Array{Float64,2}:
-25.5 -6.0 9.0 182.0
julia> f.(s, eachrow(m), a)'
1×4 Adjoint{Float64,Array{Float64,1}}:
-25.5 -6.0 9.0 182.0
Perhaps, following the approach in https://github.com/SciML/DiffEqGPU.jl/blob/master/src/DiffEqGPU.jl, they can be made to work on the GPU too.
And perhaps there should be tests & slightly better support for such things, e.g. to avoid this:
julia> m2 = similar(m, Int);
julia> @tullio m2[:,i] = cumsum(m[i,:])
ERROR: LoadError: unable to infer range of index :
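Until something like that is supported, the intended result can be written by hand. This is only a sketch in plain Base Julia, not anything Tullio generates; it assumes the intent of m2[:,i] = cumsum(m[i,:]) is "column i of m2 holds the running sum of row i of m", and the explicit Int conversion is my own addition to sidestep Int8 overflow:

```julia
# Hand-written equivalent of the failing `@tullio m2[:,i] = cumsum(m[i,:])`.
# Assumption: column i of m2 should hold the cumulative sum of row i of m.
m = rand(Int8, 4, 4)
m2 = similar(m, Int)

for (i, row) in enumerate(eachrow(m))
    # Widen to Int before summing so Int8 values cannot overflow.
    m2[:, i] .= cumsum(Int.(row))
end

# The same thing without a loop: cumulative sum along each row, then transpose.
m2_alt = permutedims(cumsum(Int.(m); dims=2))
```

Both forms should agree, which also makes them a candidate reference result for any test of slice support on the left-hand side.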
A kernel that works on the CPU only: @tullio z[r] := sum(m[r,:]) now generates this:
@kernel function kern(res::AbstractArray{T}, @Const(m), @Const(ax_r), @Const(keep), @Const(final)) where T
    (r,) = @index(Global, NTuple)
    @views begin
        res[r] = if isnothing(keep)
            sum(m[r, :])
        else
            res[r] + sum(m[r, :])
        end
    end
end
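
For comparison, here is roughly what that kernel computes, written as an ordinary loop with no KernelAbstractions involved. The function name rowsum! is hypothetical, and reading keep as "nothing means overwrite, anything else means accumulate into res" is my interpretation of the generated code above, not documented behaviour:

```julia
# Plain-loop paraphrase of the generated kernel (hypothetical helper, Base only).
# keep === nothing : overwrite res[r] with the row sum
# otherwise        : add the row sum to whatever res[r] already holds
function rowsum!(res::AbstractVector, m::AbstractMatrix, keep=nothing)
    @views for r in axes(m, 1)
        res[r] = if isnothing(keep)
            sum(m[r, :])
        else
            res[r] + sum(m[r, :])
        end
    end
    return res
end

m = [1 2; 3 4; 5 6]
res = zeros(Int, 3)
first_pass = copy(rowsum!(res, m))         # overwrite mode
second_pass = copy(rowsum!(res, m, true))  # accumulate mode
```

The slice m[r, :] becomes a view here, so the inner sum does not allocate on the CPU. The GPU problem is presumably that each work-item would be running a whole reduction over a slice, which is the part that needs the better support discussed above.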