Mateusz Baran
BTW, I think you should reduce the cost of multiplying by diagonal matrices (and probably also things like bidiagonal or tridiagonal matrices) in that PR to Base.
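For context, a minimal sketch of why the cost should be lower (plain `LinearAlgebra`, nothing specific to that PR): multiplying by a `Diagonal` from the left just scales the rows, so it needs only O(n²) work instead of the O(n³) of a dense product.

```julia
using LinearAlgebra

# Left-multiplying by a Diagonal scales the rows of A,
# so it costs O(n^2) rather than a full O(n^3) dense product.
D = Diagonal([2, 3])
A = [1 2; 3 4]
D * A  # == [2 4; 9 12]
```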
Yes, right: optimizing the computation of products of two small matrices was complicated enough, and developing a reliable set of heuristics for products of more than two matrices would be even harder...
So maybe let's not worry too much now about reordering multiplications for static matrices? Just eliminating the run-time size check would definitely be enough. > if I try `@eval LinearAlgebra...
Improving that specific case wouldn't be too hard. Matrix multiplication already has fallback methods with different levels of unrolling: https://github.com/JuliaArrays/StaticArrays.jl/blob/8ca11f871321c57cce2bfe3b9163317fa108f3c0/src/matrix_multiply.jl#L130 so it would very likely be enough to change `sa[1]` there...
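To illustrate the idea (a hypothetical sketch, not the actual StaticArrays.jl code: `mymul` and the `64` threshold are made up for illustration), selecting a fallback boils down to branching on the compile-time sizes, which the compiler can resolve statically:

```julia
using StaticArrays

# Hypothetical sketch of size-based algorithm selection: small products go
# through the fully unrolled StaticArrays kernel, larger ones through a
# plain loop. `mymul` and the threshold 64 are NOT StaticArrays internals.
function mymul(A::SMatrix{N,M}, B::SMatrix{M,P}) where {N,M,P}
    if N * M * P <= 64
        return A * B                        # unrolled StaticArrays kernel
    else
        T = promote_type(eltype(A), eltype(B))
        C = zeros(MMatrix{N,P,T})
        for j in 1:P, k in 1:M, i in 1:N    # column-major loop order
            C[i, j] += A[i, k] * B[k, j]
        end
        return SMatrix(C)
    end
end
```

Since `N`, `M`, and `P` are type parameters, the branch is decided at compile time and the untaken path is eliminated.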
You can try SnoopCompile to see where the difference comes from:

```julia
using SnoopCompile
SnoopCompile.@snoopc "/tmp/compiles_a.log" begin
    using StaticArrays
    map(==, [SVector(1:100...)], [SVector(1:100...)]);
end
SnoopCompile.@snoopc "/tmp/compiles_b.log" begin
    using StaticArrays
    SVector(1:100...) ==...
```
The time is almost entirely LLVM. On the Julia side the most expensive thing can be precompiled as:

```julia
precompile(Tuple{typeof(Base.collect), Base.Generator{Base.Iterators.Zip{Tuple{Array{StaticArrays.SArray{Tuple{100}, Int64, 1, 100}, 1}, Array{StaticArrays.SArray{Tuple{100}, Int64, 1, 100}, 1}}},...
```
> who are examples of people in the LLVM community that may be interested in / capable of taking a closer look? I guess you could ask in the internals...
Yes, that's not good, I'll try to fix this.
Sure, returning `SMatrix` there is reasonable.
When you have more than 1000 elements in an array, I doubt there is any function in StaticArrays.jl that would actually be faster than the plain Array implementation. Are...
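As a rough illustration of that point (a sketch only; actual timings are machine-dependent): at a few thousand elements, a plain `Vector` is the right container, since nothing in the operation benefits from a compile-time length, and a huge static length mostly produces enormous code for LLVM to chew on.

```julia
# For large collections a plain Vector is the sensible baseline:
# no per-length specialization, no giant unrolled methods to compile.
v = collect(1.0:2000.0)   # heap-allocated Vector{Float64}
s = sum(v)                # generic Base implementation, already fast
# s == 2_001_000.0  (sum of 1..2000 is 2000 * 2001 / 2)
```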