ChainRules.jl
ChainRules.jl copied to clipboard
Assume commutative multiplication exactly when necessary
As noted in #504, there are a number of cases where types of rules were constrained to CommutativeMulNumber where commutation of multiplication did not need to be assumed. Likewise, there were places where commutativity was assumed but not enforced by a type constraint.
This PR fixes #504 by removing constraints where un-needed and adding others where needed. Because the trigonometric, hyperbolic, logarithmic, and exponential function rules all assume cummutativity, this puts constraints on a _large_number of rules. It's possible we don't want to do this, because there are certainly numeric types out there that are real (and therefore commutative) but do not subtype Real, and in this case these would not directly hit our rules. However, the approach taken here is much safer.
I will leave this to @mcabbott to review. and unsubscribe. ping me if i am needed
I propose we explicitly test rules with ::Number types using a non-commutative number. We could use Quaternions.Quaternion, but it lacks features like rand(::Quaternion) (see https://github.com/JuliaGeometry/Quaternions.jl/pull/42). We could pirate in the test suite to add this, or we could roll our own minimal quaternion implementation, like Julia base does (https://github.com/JuliaLang/julia/blob/bb5b98e72a151c41471d8cc14cacb495d647fb7f/test/testhelpers/Quaternions.jl). I think this would be the cleanest option, as if Quaternions adds explicit ChainRules support, that could mask potential issues.
Thoughts, @mcabbott?
Implementing them here (like Base) sounds fine, but is testing with Quaternions going to quadruple the time for tests to run? Would be nice to avoid that if possible.
This closes #275 presume?
Not deep, but I have notation comments:
- Can I vote for using LinearAlgebra's
RealOrComplexinstead of the more verboseCommutativeMulNumbereverywhere? They should be exactly equivalent, but the short one nobody will have to look up to check. - If touching so many lines, I wonder if we ought to standardise letters for sensitivities a little, if not perfectly. I'd like to suggest that
Δxbe reserved for forward mode, and maybe we should start writing∇xfor reverse, at least for arrays?
(To my eye, ẋ vs x̄ are usually too small to mark what's a pretty important difference, at least when muddled up in things with NoTangent and so on, big bold words.)
Implementing them here (like Base) sounds fine, but is testing with Quaternions going to quadruple the time for tests to run? Would be nice to avoid that if possible.
I think a bunch of scalar rules should have Quaternion tests, but probably only a few array rules, so I don't expect testing time to increase by that much (but we can check).
This closes #275 presume?
It supersedes #275 (forgot about that one; it's pretty stale now) and closes #504.
* Can I vote for using LinearAlgebra's `RealOrComplex` instead of the more verbose `CommutativeMulNumber` everywhere? They should be exactly equivalent, but the short one nobody will have to look up to check.
I think though that CommutativeMulNumber carries with it the semantic reason why we limit to Real and Complex, which is a safeguard against someone in the future accidentally changing it to be more or less restrictive.
* If touching so many lines, I wonder if we ought to standardise letters for sensitivities a little, if not perfectly. I'd like to suggest that `Δx` be reserved for forward mode, and maybe we should start writing `∇x` for reverse, at least for arrays?
(To my eye,
ẋvsx̄are usually too small to mark what's a pretty important difference, at least when muddled up in things with NoTangent and so on, big bold words.)
I agree, the dots and bars are not great (and they're unicode characters often missing from people's devices). I seem to recall at some point we expressly advised people to use Δ for both input tangents and cotangents and ∂ for output tangents and cotangents. I'm not opposed to ∇ for cotangents, but I'd prefer we discuss that and agree on it outside of this PR. IMO this PR shouldn't be changing notation at all.