Added support for kron
Currently, using kron with CuArrays emits a warning about scalar indexing. This PR adds a kernel for kron along with an accompanying higher-level function.
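For context, a rough sketch of what such a kernel and wrapper could look like (assuming CUDA.jl's @cuda launch macro; kron_kernel! and the launch configuration are illustrative, not the PR's actual code):

using CUDA, LinearAlgebra

# Illustrative only: one thread per element of C = kron(A, B).
function kron_kernel!(C, A, B)
    idx = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if idx <= length(C)
        p, q = size(B)
        # Map the column-major linear index to (i, j) in C ...
        i = (idx - 1) % size(C, 1) + 1
        j = (idx - 1) ÷ size(C, 1) + 1
        # ... then kron(A, B)[i, j] == A[cld(i, p), cld(j, q)] * B[mod1(i, p), mod1(j, q)]
        @inbounds C[i, j] = A[cld(i, p), cld(j, q)] * B[mod1(i, p), mod1(j, q)]
    end
    return nothing
end

# Higher-level wrapper dispatching kron for CuMatrix inputs.
function LinearAlgebra.kron(A::CuMatrix, B::CuMatrix)
    C = similar(A, promote_type(eltype(A), eltype(B)),
                size(A, 1) * size(B, 1), size(A, 2) * size(B, 2))
    threads = 256
    blocks = cld(length(C), threads)
    @cuda threads=threads blocks=blocks kron_kernel!(C, A, B)
    return C
end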
Codecov Report
Merging #814 (694039f) into master (4e81f3d) will decrease coverage by 0.38%. The diff coverage is 0.00%.
@@            Coverage Diff             @@
##           master     #814      +/-   ##
==========================================
- Coverage   78.03%   77.64%    -0.39%
==========================================
  Files         120      121        +1
  Lines        7383     7420       +37
==========================================
  Hits         5761     5761
- Misses       1622     1659       +37
| Impacted Files | Coverage Δ |
|---|---|
| src/CUDA.jl | 100.00% <ø> (ø) |
| src/kronecker.jl | 0.00% <0.00%> (ø) |
Is this quicker than just reshaping and broadcasting, with something like this?
function kron2d(A, B)
    # Place A's and B's dimensions on separate axes of a 4-d array,
    # broadcast the elementwise product, then collapse back to a matrix.
    A4 = reshape(A, 1, size(A,1), 1, size(A,2))
    B4 = reshape(B, size(B,1), 1, size(B,2), 1)
    C4 = A4 .* B4
    C = reshape(C4, size(A,1) * size(B,1), size(A,2) * size(B,2))
end
A = [1 2 3 4; 5 6 7 8]; B = [1 10; 100 1000];
kron(A,B) == kron2d(A,B)
Your function actually executes faster on my machine.
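A comparison along these lines (assuming CUDA.jl plus BenchmarkTools.jl; the sizes here are arbitrary, not the ones actually timed):

using CUDA, BenchmarkTools

A = CUDA.rand(64, 64); B = CUDA.rand(64, 64);

# CUDA.@sync waits for the asynchronous GPU work so @btime measures it fully.
@btime CUDA.@sync kron($A, $B);     # kernel-based method from this PR
@btime CUDA.@sync kron2d($A, $B);   # reshape-and-broadcast version above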
A custom kernel seems unnecessary then?
I agree - honestly didn't know reshape would work this way.
Even though we don't need a custom kernel, we'd still need to provide a kron method with that implementation, so this didn't need to be closed.
Would this be better added to a pre-existing file rather than a standalone kronecker.jl, then?
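Wherever it ends up, the method itself could be quite small; a rough sketch (the CuMatrix signature and placement are assumptions, not the PR's code):

using CUDA, LinearAlgebra

# Hypothetical: dispatch kron for CuMatrix to the reshape-and-broadcast
# version, so no hand-written kernel is needed.
function LinearAlgebra.kron(A::CuMatrix, B::CuMatrix)
    A4 = reshape(A, 1, size(A, 1), 1, size(A, 2))
    B4 = reshape(B, size(B, 1), 1, size(B, 2), 1)
    return reshape(A4 .* B4, size(A, 1) * size(B, 1), size(A, 2) * size(B, 2))
end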
You may be interested in TensorCast.jl, my package for taking this seriously.
It's possible that some broadcasting implementation should be in Base, with the present explicit-loop implementation limited to Array or something; the loop version still seems faster on small CPU arrays. Not a short-term solution, of course. I think it would also solve https://github.com/JuliaLang/julia/issues/40439.
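Very roughly, the kind of split being suggested might look like this (hypothetical function name kron_generic; not actual Base code):

using LinearAlgebra

# Generic fallback: reshape + broadcast, works for any AbstractMatrix whose
# broadcasting is efficient (including GPU arrays).
function kron_generic(A::AbstractMatrix, B::AbstractMatrix)
    C4 = reshape(B, size(B,1), 1, size(B,2), 1) .* reshape(A, 1, size(A,1), 1, size(A,2))
    return reshape(C4, size(A,1) * size(B,1), size(A,2) * size(B,2))
end

# Specialization: keep the existing explicit-loop method for plain Array,
# where it still benchmarks faster on small inputs.
kron_generic(A::Matrix, B::Matrix) = kron(A, B)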