Rotations.jl icon indicating copy to clipboard operation
Rotations.jl copied to clipboard

UnitQuaternions and ForwardDiff

Open janbruedigam opened this issue 4 years ago • 6 comments

When differentiating a function that takes a quaternion as an input, I get several different solutions for different setups. I know it's not really correct to treat a quaternion as a regular 4D vector when differentiating, but I'm a bit confused about the dimensions and entries of some of the results.

Code:

using LinearAlgebra
using ForwardDiff
using Rotations

function testfoo1(q)
    q = UnitQuaternion(q) 
    Rotations.params(q)
end

function testfoo2(q)
    q
end

q1 = rand(4)
q1 = q1/norm(q1)
q2 = UnitQuaternion(q1)

ForwardDiff.jacobian(testfoo1,q1) # 4x4 "random" matrix
ForwardDiff.jacobian(testfoo1,q2) # 4x9 "random" matrix
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # 9x9 identity matrix

What I'd expect (but that's definitely debatable):

ForwardDiff.jacobian(testfoo1,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo1,q2) # 4x4 identity matrix or 4x3 "proper" quaternion derivative
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # 4x4 identity matrix or 4x3 "proper" quaternion derivative

Especially the first one ( ForwardDiff.jacobian(testfoo1,q1) ) where I pass in a regular vector and return a regular vector seems off to me. And dimension 9 for some of the matrices also seems off; does that come from writing the quaternions as rotation matrices?

janbruedigam avatar Jun 07 '20 22:06 janbruedigam

This is a good question, hopefully my explanations below provide some insight:

ForwardDiff.jacobian(testfoo1,q1) # equivalent to ForwardDiff.jacobian(normalize, q1)
ForwardDiff.jacobian(testfoo1,q2) # 9-dimensional input since all 3D Rotations are `StaticMatrix{3,3}`, so a `UnitQuaternion` is not treated as a `StaticVector{4}`.
ForwardDiff.jacobian(testfoo2,q1) # 4x4 identity matrix
ForwardDiff.jacobian(testfoo2,q2) # Same reason as above, the `UnitQuaternion` devolves to a 3x3 `SMatrix{3,3}` when treated as a vector/matrix.

The first one is true since the constructor automatically renormalizes the inputs. You get the expected behavior with

function testfoo3(q)
    q = UnitQuaternion(q,false) 
    Rotations.params(q)
end

You'll notice this is done internally in many methods that construct a new UnitQuaternion.

Hopefully this explains the demonstrated behavior. As far as the expected behavior, we would need to add some custom auto-diff rules to handle UnitQuaternions in cases II and IV. We've talked about implementing this but haven't done it yet.

bjack205 avatar Jun 08 '20 20:06 bjack205

Ah yes, didn't think of the renormalization; testfoo3 works. I'll leave this open for now in case you want to implement a custom auto-diff rule, but feel free to close.

janbruedigam avatar Jun 09 '20 10:06 janbruedigam

I didn't look into the detail here, but would adding custom rules make this more or less consistent? I wouldn't want to add rules which are convenient but inconsistent with the other algebric properties of types in the package.

c42f avatar Jun 12 '20 04:06 c42f

I don't know about package consistency, but I'm ok with not having special UnitQuaternion ForwardDiff rules.

Just some thoughts from a user perspective: It could be irritating that a UnitQuaternion somehow has 9 parameters. For example, if I'm a new user, don't know about params, and naively want to convert a UnitQuaternion q to an SVector, SVector(q) gives my a 9-dimensional Array. That is explainable by the internal representation, but mathematically, a quaternion can uniquely represent an orientation (at least one-directional). So the seemingly arbitrary choice to choose 3x3 matrices as a representations might be confusing and unexpected to somebody not familiar with the package internals. Another example: rand(3,3)*q gives me a 3x3 (real valued) matrix, but rand(4,4)*q an error. I get where it's coming from internally, but it seems a bit inconsistent, at least to me.

janbruedigam avatar Jun 13 '20 11:06 janbruedigam

The algebraic perspective taken in this package is that a Rotation{N} is a linear map from ℝᴺ→ℝᴺ (and therefore a NN AbstractMatrix). So in this sense all rotation subtypes abstractly have NN elements which you can access with getindex, even though the underlying parameterizations have three or four.

If we use this abstraction we can't have SVector(q) do what you want because it's inconsistent: in other circumstances you can't convert a 3x3 matrix to a length-4 vector!

I don't really know whether Rotation{N} <: AbstractMatrix{N} is the best possible placement of Rotation within the type tree, perhaps someone else can comment on this :-) Regardless, it's correct and convenient if R is treated as a linear operator that R*v rotates the vector v via the rotation R.

c42f avatar Jun 16 '20 05:06 c42f

Another way to look at this algebraic problem would be to introduce a new function rotate distinct from Base.* which can take rotation parameterizations, such that rotate(r, v) rotates vector v by the rotation parameterization r::RP, where RP <: AbstractVector is considered as a plain old vector of parameters.

For example, have RotXYZParams{T} <: StaticVector{3,T}. In this case rotation parameterizations would not implement * at all.

A third algebraic possibility would be to have rotation parameterizations be functions so that r(v) rotates v by the rotation r. This would make them directly compatible with the algebra of CoordinateTransformations.jl, rather than needing to be wrapped in CoordinateTransformations.LinearMap. This could possibly the compatible with option two where we consider RP <: AbstractVector but I'm unsure.

I suspect there's no real "right" answer but each has their uses. @bjack205 and @andyferris might have some useful perspective on this.

c42f avatar Jun 16 '20 05:06 c42f