Rotations.jl
Rotations.jl copied to clipboard
UnitQuaternions and ForwardDiff
When differentiating a function that takes a quaternion as an input, I get several different solutions for different setups. I know it's not really correct to treat a quaternion as a regular 4D vector when differentiating, but I'm a bit confused about the dimensions and entries of some of the results.
Code:
using LinearAlgebra
using ForwardDiff
using Rotations
function testfoo1(q)
q = UnitQuaternion(q)
Rotations.params(q)
end
function testfoo2(q)
q
end
q1 = rand(4)
q1 = q1/norm(q1)
q2 = UnitQuaternion(q1)
ForwardDiff.jacobian(testfoo1,q1) # 4x4 "random" matrix
ForwardDiff.jacobian(testfoo1,q2) # 4x9 "random" matrix
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # 9x9 identity matrix
What I'd expect (but that's definitely debatable):
ForwardDiff.jacobian(testfoo1,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo1,q2) # 4x4 identity matrix or 4x3 "proper" quaternion derivative
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # 4x4 identity matrix or 4x3 "proper" quaternion derivative
Especially the first one ( ForwardDiff.jacobian(testfoo1,q1)
) where I pass in a regular vector and return a regular vector seems off to me. And dimension 9 for some of the matrices also seems off; does that come from writing the quaternions as rotation matrices?
This is a good question, hopefully my explanations below provide some insight:
ForwardDiff.jacobian(testfoo1,q1) # equivalent to ForwardDiff.jacobian(normalize, q1)
ForwardDiff.jacobian(testfoo1,q2) # 9-dimensional input since all 3D Rotations are `StaticMatrix{3,3}`, so a `UnitQuaternion` is not treated as a `StaticVector{4}`.
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # Same reason as above, the `UnitQuaternion` devolves to a 3x3 `SMatrix{3,3}` when treated as a vector/matrix.
The first one is true since the constructor automatically renormalizes the inputs. You get the expected behavior with
function testfoo3(q)
q = UnitQuaternion(q,false)
Rotations.params(q)
end
You'll notice this is done internally in many methods that construct a new UnitQuaternion
.
Hopefully this explains the demonstrated behavior. As far as the expected behavior, we would need to add some custom auto-diff rules to handle UnitQuaternions
in cases II and IV. We've talked about implementing this but haven't done it yet.
Ah yes, didn't think of the renormalization; testfoo3
works.
I'll leave this open for now in case you want to implement a custom auto-diff rule, but feel free to close.
I didn't look into the detail here, but would adding custom rules make this more or less consistent? I wouldn't want to add rules which are convenient but inconsistent with the other algebric properties of types in the package.
I don't know about package consistency, but I'm ok with not having special UnitQuaternion
ForwardDiff rules.
Just some thoughts from a user perspective: It could be irritating that a UnitQuaternion
somehow has 9 parameters. For example, if I'm a new user, don't know about params
, and naively want to convert a UnitQuaternion
q
to an SVector
, SVector(q)
gives my a 9-dimensional Array.
That is explainable by the internal representation, but mathematically, a quaternion can uniquely
represent an orientation (at least one-directional). So the seemingly arbitrary choice to choose 3x3 matrices as a representations might be confusing and unexpected to somebody not familiar with the package internals.
Another example: rand(3,3)*q
gives me a 3x3 (real valued) matrix, but rand(4,4)*q
an error. I get where it's coming from internally, but it seems a bit inconsistent, at least to me.
The algebraic perspective taken in this package is that a Rotation{N}
is a linear map from ℝᴺ→ℝᴺ (and therefore a NN AbstractMatrix
). So in this sense all rotation subtypes abstractly have NN elements which you can access with getindex
, even though the underlying parameterizations have three or four.
If we use this abstraction we can't have SVector(q)
do what you want because it's inconsistent: in other circumstances you can't convert a 3x3 matrix to a length-4 vector!
I don't really know whether Rotation{N} <: AbstractMatrix{N}
is the best possible placement of Rotation within the type tree, perhaps someone else can comment on this :-) Regardless, it's correct and convenient if R
is treated as a linear operator that R*v
rotates the vector v
via the rotation R
.
Another way to look at this algebraic problem would be to introduce a new function rotate
distinct from Base.*
which can take rotation parameterizations, such that rotate(r, v)
rotates vector v
by the rotation parameterization r::RP
, where RP <: AbstractVector
is considered as a plain old vector of parameters.
For example, have RotXYZParams{T} <: StaticVector{3,T}
. In this case rotation parameterizations would not implement *
at all.
A third algebraic possibility would be to have rotation parameterizations be functions so that r(v)
rotates v
by the rotation r
. This would make them directly compatible with the algebra of CoordinateTransformations.jl, rather than needing to be wrapped in CoordinateTransformations.LinearMap
. This could possibly the compatible with option two where we consider RP <: AbstractVector
but I'm unsure.
I suspect there's no real "right" answer but each has their uses. @bjack205 and @andyferris might have some useful perspective on this.