StaticArrays.jl
Accuracy loss in `normalize`/`cross`
Hi!
I've noticed a small loss in accuracy when computing `mat * vec`:
Without StaticArrays:
julia> lookMat = Float32[ -0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0]
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
0.0
-2.9154742
1.0
With StaticArrays loaded, the accuracy loss is the same whether I use Float32 or Float64 (note the nonzero y component):
julia> Array(lookMat)
4×4 Matrix{Float64}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> Array(lookMat) * [0.5, 0.5, 0, 1]
4-element Vector{Float64}:
0.0
2.9802322387695312e-8
-2.915476143360138
1.0
The matrix is generated by this function:
function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = @SMatrix Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
What this does, in essence, is orient a coordinate system with origin at (2,2,2) to point at target in the -z direction. So `lookMat * target` should be exactly a vector in the -z direction.
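As a sanity check (a small sketch using the same eye and target values as the session below), the expected magnitude of that z component is just the eye-to-target distance:

```julia
using LinearAlgebra

# with eye = (2,2,2) and target = (0.5, 0.5, 0), the camera-space target
# should land on the -z axis at exactly the eye-to-target distance:
eye    = [2.0, 2.0, 2.0]
target = [0.5, 0.5, 0.0]
norm(target - eye)   # sqrt(8.5) ≈ 2.9154759
```

So the exact expected product is `[0, 0, -2.9154759..., 1]`.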
Accuracy loss of the StaticArray version aside, is it expected that this also happens for the regular Array multiplication?
Any change to regular matrix-vector multiplication after loading StaticArrays.jl is very much unexpected and undesirable. I couldn't reproduce your results, though:
julia> Array(lookMat) * [0.5, 0.5, 0, 1]
4-element Vector{Float64}:
0.0
0.0
-2.915473997592926
1.0
Could you add a full example, starting from a fresh Julia session?
Sure:
Expand for session
[sukera@tempman vulkanAdventures]$ julia -q
(@v1.9) pkg> activate --temp
Activating new project at `/tmp/jl_d003TL`
(jl_d003TL) pkg> add StaticArrays
Updating registry at `~/.julia/registries/General.toml`
Resolving package versions...
Updating `/tmp/jl_d003TL/Project.toml`
[90137ffa] + StaticArrays v1.5.0
Updating `/tmp/jl_d003TL/Manifest.toml`
[90137ffa] + StaticArrays v1.5.0
[1e83bf80] + StaticArraysCore v1.0.1
[56f22d72] + Artifacts
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[9a3f8284] + Random
[ea8e919c] + SHA v0.7.0
[9e88b42a] + Serialization
[2f01184e] + SparseArrays
[10745b16] + Statistics
[e66e0078] + CompilerSupportLibraries_jll v0.5.2+0
[4536629a] + OpenBLAS_jll v0.3.20+0
[8e850b90] + libblastrampoline_jll v5.1.0+0
julia> using StaticArrays
julia> function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = @SMatrix Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
lookAt (generic function with 1 method)
julia> const Vec3 = SVector{3, Float32}
SVector{3, Float32} (alias for SArray{Tuple{3}, Float32, 1, 3})
julia> using LinearAlgebra
julia> lookMat = lookAt(Vec3(2,2,2), Vec3(0.5, 0.5, 0), Vec3(0,0,1))
4×4 SMatrix{4, 4, Float32, 16} with indices SOneTo(4)×SOneTo(4):
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> const Vec4 = SVector{4, Float32}
SVector{4, Float32} (alias for SArray{Tuple{4}, Float32, 1, 4})
julia> lookMat * Vec4(0.5, 0.5, 0, 1)
4-element SVector{4, Float32} with indices SOneTo(4):
0.0
2.9802322f-8
-2.915476
1.0
julia> Array(lookMat) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
julia> Array(lookMat) * [0.5, 0.5, 0, 1]
4-element Vector{Float64}:
0.0
2.9802322387695312e-8
-2.915476143360138
1.0
And running pure Array in a fresh session:
[sukera@tempman vulkanAdventures]$ julia -q
julia> using LinearAlgebra
julia> function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
lookAt (generic function with 1 method)
julia> lookMat = lookAt([2,2,2], [0.5, 0.5, 0], [0,0,1])
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
0.0
-2.9154758
1.0
Notably, doing using StaticArrays after already having done a matmul does not affect the accuracy.
I'm not 100% sure, but to me it looks like the loss of accuracy happens in one of the conversions:
julia> repr(lookMat) # from lookAt
"Float32[-0.70710677 0.70710677 0.0 -0.0; -0.48507127 -0.48507127 0.7276069 0.4850713; 0.5144958 0.5144958 0.6859944 -3.429972; 0.0 0.0 0.0 1.0]"
# fresh session
# lookMat copied from `repr` above
julia> lookMat = Float32[-0.70710677 0.70710677 0.0 -0.0; -0.48507127 -0.48507127 0.7276069 0.4850713; 0.5144958 0.5144958 0.6859944 -3.429972; 0.0 0.0 0.0 1.0]
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
julia> lookMat * [0.5, 0.5, 0, 1]
4-element Vector{Float64}:
0.0
2.9802322387695312e-8
-2.915476143360138
1.0
Well, I did not copy the matrix in the reproducer above - see the collapsible section labelled Expand for session. I just didn't put the pure-Array version in there as well, since it was so short.
Perhaps it has to do with my CPU/julia/LLVM version?
julia> versioninfo()
Julia Version 1.9.0-DEV.724
Commit 95be1a4797* (2022-06-12 06:35 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: 4 × Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
Threads: 4 on 4 virtual cores
Environment:
JULIA_NUM_THREADS = 4
The arrays are not the same:
julia> lookMat_Array - lookMat_SA
4×4 SMatrix{4, 4, Float32, 16} with indices SOneTo(4)×SOneTo(4):
0.0 0.0 0.0 0.0
2.98023f-8 2.98023f-8 0.0 -5.96046f-8
-5.96046f-8 -5.96046f-8 -5.96046f-8 2.38419f-7
0.0 0.0 0.0 0.0
probably due to differences in `dot`, `cross`, or `normalize` between StaticArrays and regular arrays.
That doesn't really explain it when I explicitly make them the same:
[sukera@tempman vulkanAdventures]$ julia -q
(@v1.9) pkg> activate --temp
Activating new project at `/tmp/jl_bhGm2a`
(jl_bhGm2a) pkg> add StaticArrays
Updating registry at `~/.julia/registries/General.toml`
Resolving package versions...
Updating `/tmp/jl_bhGm2a/Project.toml`
[...]
julia> using StaticArrays, LinearAlgebra
julia> function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = @SMatrix Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
lookAt (generic function with 1 method)
julia> const Vec3 = SVector{3, Float32}
SVector{3, Float32} (alias for SArray{Tuple{3}, Float32, 1, 3})
julia> lookMat = lookAt(Vec3(2,2,2), Vec3(0.5, 0.5, 0), Vec3(0,0,1))
4×4 SMatrix{4, 4, Float32, 16} with indices SOneTo(4)×SOneTo(4):
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat - Array(lookMat) # same for the reverse..
4×4 SMatrix{4, 4, Float32, 16} with indices SOneTo(4)×SOneTo(4):
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0
julia> Array(lookMat) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
julia> Array(lookMat)
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
That the operation works when I don't load StaticArrays (and thus don't have access to @SMatrix) is exactly what I'm confused about. Behavior for Array shouldn't change, no matter where the numbers come from, right?
The issue is that the `lookAt` operating on `Array`s and the `lookAt` operating on `SArray`s don't compute exactly the same thing:
julia> using StaticArrays, LinearAlgebra
julia> const Vec3 = SVector{3, Float32}
SVector{3, Float32} (alias for SArray{Tuple{3}, Float32, 1, 3})
julia> function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
lookAt (generic function with 1 method)
julia> lookMatS = lookAt(Vec3(2,2,2), Vec3(0.5, 0.5, 0), Vec3(0,0,1))
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat = lookAt([2,2,2], [0.5, 0.5, 0], [0,0,1])
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat - lookMatS
4×4 Matrix{Float32}:
0.0 0.0 0.0 0.0
2.98023f-8 2.98023f-8 0.0 -5.96046f-8
-5.96046f-8 -5.96046f-8 -5.96046f-8 2.38419f-7
0.0 0.0 0.0 0.0
It is unrelated to multiplication.
You're right!
julia> bitstring.(lookMat) .== bitstring.(lookMatS)
4×4 BitMatrix:
1 1 1 1
0 0 1 0
0 0 0 0
1 1 1 1
that seems.. not good?
xaxis is the same, so the problem shouldn't be (at least not exclusively) with `normalize`. That only leaves `cross` for introducing a difference? I guess this partly comes down to "should there be a difference at all, or are errors on this scale OK?"
Here's some information about the differences in bit pattern. It seems the differences are confined to the least significant bits of the mantissa.
Information about differences
julia> diffs = filter(x -> any(!iszero, x), map.(!=, bitstring.(lookMat), bitstring.(lookMatS)))
7-element Vector{Vector{Bool}}:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0 … 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
julia> map(s -> join(map(Int, s)), diffs)
7-element Vector{String}:
"00000000000000000000000000000001"
"00000000000000000000000000000001"
"00000000000000000000000000000001"
"00000000000000000000000000000001"
"00000000000000000000000000000111"
"00000000000000000000000000000110"
"00000000000000000000000000000001"
julia> idxs = findall(x -> any(!iszero, x), map.(!=, bitstring.(lookMat), bitstring.(lookMatS)))
7-element Vector{CartesianIndex{2}}:
CartesianIndex(2, 1)
CartesianIndex(3, 1)
CartesianIndex(2, 2)
CartesianIndex(3, 2)
CartesianIndex(3, 3)
CartesianIndex(2, 4)
CartesianIndex(3, 4)
julia> bitstring.(lookMat[idxs])
7-element Vector{String}:
"10111110111110000101101101000010"
"00111111000000111011010111111110"
"10111110111110000101101101000010"
"00111111000000111011010111111110"
"00111111001011111001110101010011"
"00111110111110000101101101000010"
"11000000010110111000010010101000"
julia> lookMat[idxs]
7-element Vector{Float32}:
-0.48507124
0.51449573
-0.48507124
0.51449573
0.6859943
0.48507124
-3.4299717
julia> lookMatS[idxs]
7-element Vector{Float32}:
-0.48507127
0.5144958
-0.48507127
0.5144958
0.6859944
0.4850713
-3.429972
Now I see that the difference appears even earlier. Here: `lookMatS = lookAt(Vec3(2,2,2), Vec3(0.5, 0.5, 0), Vec3(0,0,1))` we work on Float32 input, while here: `lookMat = lookAt([2,2,2], [0.5, 0.5, 0], [0,0,1])` it's Float64. When you convert the inputs to the same type (either Float32 or Float64), the difference disappears.
The difference does, but the wrong result does not:
[sukera@tempman ~]$ julia -q
julia> using LinearAlgebra
julia> function lookAt(eye, target, up)
# up is local up of the camera
zaxis = -normalize(target - eye)
xaxis = normalize(cross(up, zaxis))
yaxis = cross(zaxis, xaxis)
res = Float32[
xaxis[1] xaxis[2] xaxis[3] -dot(eye, xaxis);
yaxis[1] yaxis[2] yaxis[3] -dot(eye, yaxis);
zaxis[1] zaxis[2] zaxis[3] -dot(eye, zaxis);
0 0 0 1
]
return res
end
lookAt (generic function with 1 method)
julia> lookMat = lookAt(Float32[2,2,2], Float32[0.5, 0.5, 0], Float32[0,0,1])
4×4 Matrix{Float32}:
-0.707107 0.707107 0.0 -0.0
-0.485071 -0.485071 0.727607 0.485071
0.514496 0.514496 0.685994 -3.42997
0.0 0.0 0.0 1.0
julia> lookMat * (Float32[0.5, 0.5, 0, 1])
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
So I guess this is an inaccuracy in LinearAlgebra in either cross or normalize (or both)?
Good question -- I'm not even sure whether you can expect that code (`lookAt`) to be more accurate.
Well, the 64-bit version is accurate, if I'm not mistaken, according to our investigation here :| It would be interesting to know where the inaccuracy comes from.
As best I can tell, the issue here was that `(lookMat * Float32[0.5, 0.5, 0, 1])[2] == 2.9802322f-8` rather than `== 0.0f0`? If not, I've misunderstood this issue; please disregard everything that follows.
This looks like a simple case of floating point roundoff error to me. Notice that `lookMat[2,1] + lookMat[2,4] == 2.9802322f-8 == eps(lookMat[2,1])`.
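That cancellation can be reproduced in isolation, using the row-2 entries copied from the `repr` output earlier in the thread:

```julia
# row 2 of the Float32 lookMat, from the repr above:
m21 = -0.48507127f0   # == m22
m24 =  0.4850713f0

# the y component of lookMat * [0.5, 0.5, 0, 1] is 0.5*m21 + 0.5*m22 + m24,
# and 0.5*m21 + 0.5*m22 == m21 exactly (scaling by 0.5 is exact), so:
m21 + m24        # 2.9802322f-8, one ulp short of perfect cancellation
eps(abs(m21))    # 2.9802322f-8
```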
From the first line, in infinite precision we have `norm(target - eye) == sqrt(17//2)`, leading to `zaxis == [-sqrt(9//34), -sqrt(9//34), -sqrt(16//34)]`. These numbers aren't even rational, much less do they have an exact representation in Float32 (or Float64, or BigFloat). Rounding is necessary, and roundoff error ensues and propagates.
You can get more accuracy with a longer floating point format, but any float will see roundoff errors propagate when unrepresentable numbers (e.g., any irrational) are involved. Exact results require symbolic math. The fact that Float64 arithmetic gave a residual of 0.0e0 here is lucky rounding.
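To illustrate (a sketch, not anything from either package): no binary float of any width represents these components exactly, so every format rounds somewhere:

```julia
# sqrt(9/34) is irrational; even a 256-bit BigFloat only approximates it,
# and the Float32/Float64 roundings of that approximation differ from it too:
setprecision(256) do
    z = sqrt(big(9) / big(34))
    Float32(z) == z, Float64(z) == z   # (false, false): every width rounds
end
```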
Speaking of luck, I observe that we see the desired residual 0.0f0 in this specific example when we make the substitution
# xaxis = normalize(cross(up, zaxis))
xaxis = normalize(cross(up, eye - target))
In general, it's slightly better to defer normalization to avoid the extra roundoff it introduces; since we normalize the result of the `cross` here anyway, this is one such place. `normalize` surrenders additional accuracy by using `x * inv(norm(x))` rather than `x / norm(x)`, which involves two rounding operations (beyond `norm`) rather than just one (it's used because it's faster when the array has more than a few elements). This leads to "exciting" results like `normalize([49.0]) == [0.9999999999999999]`.
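The two extra roundings are easy to see in isolation (a minimal sketch of the same effect):

```julia
using LinearAlgebra

x = [49.0]
norm(x) == 49.0            # true: the norm itself happens to be exact here
x / norm(x)                # [1.0]  -- one rounded division, exact in this case
normalize(x)               # [0.9999999999999999]  -- inv rounds, then * rounds
49.0 * inv(49.0) == 1.0    # false: the reciprocal detour loses the last bit
```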
Yes, the underlying issue was that roundoff, but the larger question is why the code produced a different result for Array vs. StaticArray, which I don't think should happen. After all, the operations performed ought to be identical.
I thought that was resolved earlier to be the difference between Float32 and Float64 arithmetic?
julia> lookAt(SVector{3,Float32}(2,2,2), SVector{3,Float32}(0.5, 0.5, 0), SVector{3,Float32}(0,0,1)) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
julia> lookAt(SVector{3,Float64}(2,2,2), SVector{3,Float64}(0.5, 0.5, 0), SVector{3,Float64}(0,0,1)) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
0.0
-2.9154758
1.0
julia> lookAt(Float32[2,2,2], Float32[0.5, 0.5, 0], Float32[0,0,1]) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
2.9802322f-8
-2.915476
1.0
julia> lookAt(Float64[2,2,2], Float64[0.5, 0.5, 0], Float64[0,0,1]) * Float32[0.5, 0.5, 0, 1]
4-element Vector{Float32}:
0.0
0.0
-2.9154758
1.0
That said, I wouldn't generally insist that all SArray operations match Array operations bit-for-bit. StaticArrays can't match both OpenBLAS and MKL (unless it relegates all relevant operations to whatever backend is loaded, but then we lose all the performance benefits of StaticArrays). StaticArrays should be free to re-associate certain operations as it sees fit. That can lead to slightly different results, but neither computation would be universally "more correct" than the other.
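A minimal illustration of how re-association alone changes results (plain Float64, nothing specific to either package):

```julia
# floating point addition is not associative; a matmul kernel that sums its
# products in a different order can legitimately differ in the last bit:
(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)   # false
(0.1 + 0.2) + 0.3                        # 0.6000000000000001
0.1 + (0.2 + 0.3)                        # 0.6
```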