ArrayLayouts.jl
ArrayLayouts.jl copied to clipboard
The example in the README is confusing
The README shows the following example:
julia> using ArrayLayouts
julia> A = randn(10_000,10_000); x = randn(10_000); y = similar(x);
julia> V = view(Symmetric(A),:,:)';
julia> @time mul!(y, A, x); # Julia does not recognize that V is symmetric
0.040255 seconds (4 allocations: 160 bytes)
julia> @time muladd!(1.0, V, x, 0.0, y); # ArrayLayouts does and is 3x faster as it calls BLAS
0.017677 seconds (4 allocations: 160 bytes)
Specifically this line:
julia> @time mul!(y, A, x); # Julia does not recognize that V is symmetric
The argument A
is used instead of V
, so the comment is irrelevant, Julia cannot figure out that the matrix is symmetric, because A
matrix is not really symmetric. Also both function should not allocate anything, the reported allocations are possible due to use of global variables in the REPL.
Here is an updated benchmark (with BenchmarkTools
), which does not really show any difference if V
argument is used instead and without global variables. At least on my computer.
@benchmark LinearAlgebra.mul!(y, V, x) setup = begin
rng = StableRNG(123)
y = randn(rng, 10_000)
x = randn(rng, 10_000)
L = randn(rng, 10_000, 10_000)
A = L * L' + I(10_000) # Make an actual symmetric matrix
V = Symmetric(A)
end
@benchmark ArrayLayouts.muladd!(1.0, V, x, 0.0, y) setup = begin
rng = StableRNG(123)
y = randn(rng, 10_000)
x = randn(rng, 10_000)
L = randn(rng, 10_000, 10_000)
A = L * L' + I(10_000) # Make an actual symmetric matrix
V = Symmetric(A)
end
with the output
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 9.245 ms … 9.855 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.550 ms ┊ GC (median): 0.00%
Time (mean ± σ): 9.550 ms ± 431.100 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
9.24 ms Histogram: frequency by time 9.85 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 9.057 ms … 9.130 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.093 ms ┊ GC (median): 0.00%
Time (mean ± σ): 9.093 ms ± 51.707 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
9.06 ms Histogram: frequency by time 9.13 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
A small note to that: there is also no documentation for the muladd!
from ArrayLayouts
even though it is used as a first example. It's hard to guess what argument stores the result of the operation. I assume the usage of !
means that the function mutates its arguments.
Julia Version 1.9.0
Commit 8e630552924 (2023-05-07 11:25 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 10 × Apple M2 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
Threads: 2 on 6 virtual cores
Environment:
JULIA_IMAGE_THREADS = 1
I noticed, that I forgot the V = view(Symmetric(A),:,:)';
, these are the updated benchmarks:
@benchmark LinearAlgebra.mul!(y, V, x) setup = begin
rng = StableRNG(123)
y = randn(rng, 10_000)
x = randn(rng, 10_000)
L = randn(rng, 10_000, 10_000)
A = L * L' + I(10_000)
V = view(Symmetric(A), :, :)'
end
@benchmark ArrayLayouts.muladd!(1.0, V, x, 0.0, y) setup = begin
rng = StableRNG(123)
y = randn(rng, 10_000)
x = randn(rng, 10_000)
L = randn(rng, 10_000, 10_000)
A = L * L' + I(10_000)
V = view(Symmetric(A), :, :)'
end
with the following output
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 194.631 ms … 194.917 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 194.774 ms ┊ GC (median): 0.00%
Time (mean ± σ): 194.774 ms ± 202.764 μs ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
195 ms Histogram: frequency by time 195 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
BenchmarkTools.Trial: 2 samples with 1 evaluation.
Range (min … max): 14.728 ms … 16.522 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 15.625 ms ┊ GC (median): 0.00%
Time (mean ± σ): 15.625 ms ± 1.269 ms ┊ GC (mean ± σ): 0.00% ± 0.00%
█ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
14.7 ms Histogram: frequency by time 16.5 ms <
Memory estimate: 0 bytes, allocs estimate: 0.
The default julia is indeed slower in this case. I think the example in the documentation can be clarified better and the wrong argument A
should be fixed. It's not immediately obvious why you put view(Symmetric(A), :, :)'
, because it is essentially a no-op, but I understand that it intends to "fool" Julia.
Thanks for this! Would you be willing to make a PR updating the README?