StaticArrays.jl
Improve 2x2 eigen
Using the fact that the eigenvectors must be orthonormal, v21 and v22 can be computed from v12 and v11. In addition to being more performant, this is also more robust in some cases: for instance, with [1.0 -9.465257061381386e-20; -9.465257061381386e-20 1.0] the previous code would return duplicated eigenvectors.
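As a rough sketch of the idea (not the exact code in this PR, and with illustrative variable names rather than the v11/v12/v21/v22 used internally): once the first normalized eigenvector of a 2x2 Hermitian matrix is known, the second can be obtained directly as its orthogonal complement instead of solving a second, possibly ill-conditioned, system.

# Sketch only; assumes a 2x2 Hermitian input.
using StaticArrays, LinearAlgebra

A = Hermitian(@SMatrix [2.0 1.0; 1.0 3.0])
λ = eigvals(A)                            # sorted ascending

# First eigenvector from (A - λ₁I)v = 0, then normalized.
v1 = normalize(SVector(A[1, 2], λ[1] - A[1, 1]))

# Second eigenvector taken purely from orthonormality with the first.
v2 = SVector(-conj(v1[2]), conj(v1[1]))

dot(v1, A * v1) ≈ λ[1]    # true
dot(v2, A * v2) ≈ λ[2]    # true
abs(dot(v1, v2)) < 1e-14  # true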
Coverage decreased (-0.04%) to 81.857% when pulling 54afe616d04536d4606c03e71ca403e17b16a3b7 on tfgast:master into 103e9d4dabfdb64ccb378c8539d15136dde964de on JuliaArrays:master.
Looks great to me, thanks!
Could you add a test or tests which cover the cases the previous code got wrong? Is this characterized by the case where the off-diagonal elements are less than eps(eltype(A))?
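For what it's worth, the off-diagonal entries of the example matrix from the description are indeed below eps(Float64):

julia> using StaticArrays

julia> A = @SMatrix [1.0 -9.465257061381386e-20; -9.465257061381386e-20 1.0];

julia> abs(A[1, 2]) < eps(eltype(A))
true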
Would this fix also allow us to remove the special case where A is detected to be diagonal? Just looking at the original code, the diagonal test seems unlikely to be numerically reliable... and is ultimately the root cause of this bug.
I've improved the handling in the case of underflow as well. I wasn't able to remove the special case for diagonal matrices, as we still need to avoid dividing by zero. I've also added some more tests.
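For example, a regression test along these lines (a sketch, not necessarily identical to the tests added in the PR) covers the matrix from the description, where the old code returned duplicated eigenvectors:

using StaticArrays, LinearAlgebra, Test

# Off-diagonal entries far below eps(Float64); previously both eigenvectors
# collapsed onto the same direction for this input.
A = Hermitian(@SMatrix [1.0 -9.465257061381386e-20; -9.465257061381386e-20 1.0])
E = eigen(A)

@test E.vectors' * E.vectors ≈ one(SMatrix{2,2,Float64})  # columns are orthonormal
@test E.vectors * Diagonal(E.values) * E.vectors' ≈ A     # reconstructs A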
Where is this one at?
Also, how does this change as a whole impact performance?
Do we have an answer to this one?
I tested the performance impact of this just now: as far as I can see, there is a speed-up for Hermitian{ComplexF64, ...} matrices but a slow-down for Hermitian{Float64, ...} matrices (it is not completely clear to me why):
Hermitian{ComplexF64, ...} matrices
Master:
julia> @benchmark StaticArrays._eig(Size{(2,2)}(), A, true, true) setup=begin
a = @SMatrix rand(ComplexF64,2,2)
A = Hermitian(a + a')
end
BenchmarkTools.Trial: 10000 samples with 940 evaluations.
Range (min … max): 101.915 ns … 324.681 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 102.766 ns ┊ GC (median): 0.00%
Time (mean ± σ): 105.722 ns ± 9.921 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▅█▁ ▁▃▁ ▁▁ ▁
███▆▅▄▆▆▆▅▅▅▆███▇█████▇▇▇▆▅▅▅▅▅▅▄▅▅▅▄▅▄▄▅▄▅▄▄▄▅▅▄▄▄▅▅▅▄▄▄▄▃▄▃ █
102 ns Histogram: log(frequency) by time 149 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
PR: (speed-up)
julia> @benchmark StaticArrays._eig(Size{(2,2)}(), A, true, true) setup=begin
a = @SMatrix rand(ComplexF64,2,2)
A = Hermitian(a + a')
end
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 64.147 ns … 324.821 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 66.394 ns ┊ GC (median): 0.00%
Time (mean ± σ): 67.999 ns ± 7.887 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄▆▆█▂ ▁▂▁▁ ▁ ▁
█████▆▄▅▃▆▅▆▆▆▄▆██████████▇▇▇▇▆▆▅▅▅▅▅▅▄▅▅▅▄▆▄▅▅▄▅▁▄▅▄▅▃▅▅▄▄▄ █
64.1 ns Histogram: log(frequency) by time 104 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
Hermitian{Float64, ...} matrices
Master:
julia> @benchmark StaticArrays._eig(Size{(2,2)}(), A, true, true) setup=begin
a = @SMatrix rand(Float64,2,2)
A = Hermitian(a + a')
end
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 8.909 ns … 69.670 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 9.009 ns ┊ GC (median): 0.00%
Time (mean ± σ): 9.230 ns ± 1.770 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▄ ▁
██▆▄▄▄▄▁▃▄▅▄▃▃▃▄▁▁▁▄▅▇▇▆▅▄▅▄▄▃▄▄▅▄▄▄▁▃▃▁▃▃▁▁▁▃▁▁▁▁▁▃▁▄▃▃▄▅ █
8.91 ns Histogram: log(frequency) by time 18.9 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
PR: (slow-down)
julia> @benchmark StaticArrays._eig(Size{(2,2)}(), A, true, true) setup=begin
a = @SMatrix rand(Float64,2,2)
A = Hermitian(a + a')
end
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
Range (min … max): 29.879 ns … 204.930 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 31.489 ns ┊ GC (median): 0.00%
Time (mean ± σ): 32.438 ns ± 5.469 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▆▃ █▃ ▁
███████▆▆▅▄▄▄▅▁▃▄▄▅▆▅▆▅▅▄▁▄▅▇█▇█▇▇▆▇▇▆▆▇▅▇▅▇▇▅▅▅▆▆▆█▇▇▆▅▆▆▅▆ █
29.9 ns Histogram: log(frequency) by time 51.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.