Kristoffer Carlsson
Updated. Some benchmarks made on https://github.com/JuliaLang/julia/pull/29258:

```jl
using BenchmarkTools
using StaticArrays

x = rand(MMatrix{8,8});
s = rand(SMatrix{8,8});
```

Before

```jl
julia> @btime map!(x -> x*2, $x, $s);
  11.056 ns (0...
```
Arguably the old methods should be left in and we should `@static if` based on `VERSION`... Thoughts?
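As a rough sketch of what that could look like (the version bound and method bodies below are placeholders, not the actual StaticArrays definitions):

```jl
using StaticArrays

# Hypothetical sketch of keeping the old method behind a VERSION check.
# The bound v"1.1.0-DEV" and the bodies are illustrative placeholders.
@static if VERSION >= v"1.1.0-DEV"
    # On newer Julia the generic machinery is already fast, so just use it.
    double_all(x::StaticArray) = map(v -> 2v, x)
else
    # On older Julia keep the previous hand-written definition around.
    double_all(x::StaticArray) = 2 .* x
end
```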
```jl
using BenchmarkTools
using StaticArrays
using LinearAlgebra

for siz in (1,2,3,4,8)
    println("size = $siz x $siz")
    # Refs to avoid inlining into benchmark loop
    s = Ref(rand(SMatrix{siz, siz}))
    @btime sum(abs2,...
```
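The rest of that loop is cut off above; purely as a sketch of the pattern (the workloads below are stand-ins, since the original expressions are truncated), the size sweep with the `Ref` trick looks something like this:

```jl
using BenchmarkTools
using StaticArrays
using LinearAlgebra

# Sketch of the size-sweep pattern; sum(abs2, ...) and the matmuls are stand-in
# workloads, not necessarily the ones from the truncated benchmark above.
for siz in (1, 2, 3, 4, 8)
    println("size = $siz x $siz")
    # Wrap the inputs in Ref so their values are not treated as compile-time
    # constants and inlined into the benchmark loop.
    s = Ref(rand(SMatrix{siz, siz}))
    m = Ref(rand(MMatrix{siz, siz}))
    @btime sum(abs2, $s[])   # reduction over an SMatrix
    @btime $s[] * $s[]       # SMatrix * SMatrix
    @btime $m[] * $m[]       # MMatrix * MMatrix
end
```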
Removed the `map!` loop since it seems LLVM decided not to unroll it, even though doing so would perhaps have been advantageous.
> The escape analysis and codegen already work perfectly well, I’d just want conversions between mutable and immutable tuples to be no-ops when appropriate (as you want to store your...
Note that it is possible to write something like

```jl
function matmul(a::SMatrix{I, J}, b::SMatrix{J, K}) where {I, J, K}
    c = zero(MMatrix{I, K})
    @inbounds for k in 1:K, j in...
```
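The snippet is truncated above; a completed version of the idea might look roughly like this. The loop body, the use of `muladd`, and the final conversion back to `SMatrix` are my reconstruction, not the original code:

```jl
using StaticArrays

# Reconstructed sketch: accumulate into a mutable MMatrix and convert back to an
# SMatrix at the end, relying on the compiler to elide the mutable intermediate.
function matmul(a::SMatrix{I, J}, b::SMatrix{J, K}) where {I, J, K}
    c = zero(MMatrix{I, K})
    @inbounds for k in 1:K, j in 1:J, i in 1:I
        c[i, k] = muladd(a[i, j], b[j, k], c[i, k])
    end
    return SMatrix{I, K}(c)
end
```

Whether this ends up allocation-free then comes down to the escape-analysis / conversion question in the quoted comment above.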
Not sure what the best thing to do is. FWIW, this just seems to be the SLP vectorizer doing a bad job of vectorizing the code; it would be interesting to write the...
Seems clang kinda barfs on e.g. a 3x3 pattern as well: https://godbolt.org/z/TGCusg. It is interesting to note that when the sizes correspond to the width of the SIMD registers the...
I always saw not using `muladd` in StaticArrays as just a missed optimization. Are you saying it is an intended choice? Seems quite out of spirit with other choices made...
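For reference, `muladd(a, b, c)` computes `a*b + c` but allows the compiler to fuse it into a single fma instruction where the hardware supports it; a tiny illustration of the kind of inner kernel this affects (the function names are made up, not StaticArrays code):

```jl
# Illustrative 3-element dot-product kernels; names are hypothetical.
dot_plain(a, b)  = a[1]*b[1] + a[2]*b[2] + a[3]*b[3]
dot_muladd(a, b) = muladd(a[3], b[3], muladd(a[2], b[2], a[1]*b[1]))

dot_plain((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))   # 32.0
dot_muladd((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))  # 32.0, but may compile to fma
```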
Apparently, LLVM has matrix multiplication intrinsics now: https://llvm.org/docs/LangRef.html#llvm-matrix-multiply-intrinsic