ForwardDiff.jl
Create lightweight package AbstractDualNumbers or ForwardDiffBase or similar?
My apologies if this has been suggested before:
While ForwardDiff is not the heaviest of packages in the ecosystem, it's also not exactly lightweight (it takes 1.6 seconds to load on my system). A lightweight package AbstractDualNumbers.jl or ForwardDiffBase.jl (or similar) that just defines something like abstract type AbstractDualNumber{Tag} <: Real end and things like function AbstractDualNumbers.value end and function AbstractDualNumbers.partials end could allow packages to define custom push-forwards without depending on ForwardDiff itself.
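To make the proposal concrete, here is a minimal sketch of how the pieces could fit together. It is only an illustration under stated assumptions: ToyDual stands in for ForwardDiff.Dual and is not its real implementation, softsign is a made-up downstream function, and rebuild is a hypothetical extra stub (not part of the proposal above) that a downstream package would need in order to construct a result without knowing the concrete dual type.

```julia
# --- would live in the lightweight interface package (hypothetical) ---
abstract type AbstractDualNumber{Tag} <: Real end
function value end      # primal value of a dual number
function partials end   # derivative part of a dual number
function rebuild end    # hypothetical: build a dual of the same concrete type

# --- stand-in for an AD package such as ForwardDiff (not its real code) ---
struct ToyDual{Tag,T<:Real} <: AbstractDualNumber{Tag}
    val::T
    der::T
end
value(d::ToyDual) = d.val
partials(d::ToyDual) = d.der
rebuild(d::ToyDual{Tag}, val, der) where {Tag} = ToyDual{Tag,typeof(val)}(val, der)

# --- downstream package: only refers to the interface, never to ToyDual ---
softsign(x::Real) = x / (1 + abs(x))

# Custom push-forward: d/dx softsign(x) = 1 / (1 + |x|)^2
function softsign(d::AbstractDualNumber)
    x, dx = value(d), partials(d)
    return rebuild(d, softsign(x), dx / (1 + abs(x))^2)
end

# Usage: seed the derivative part with 1 to differentiate at x = 1
d = ToyDual{Nothing,Float64}(1.0, 1.0)
y = softsign(d)
```

In this sketch the downstream method dispatches on the abstract type only, so the package defining softsign would not need to load the AD package at all.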
I know that there are exciting efforts underway in the Julia-AD-ecosystem for new ADs (e.g. Diffractor), but ForwardDiff is certainly not going away any time soon. A really lightweight way to define push-forwards could reduce the frequency of @require ForwardDiff in the ecosystem quite a bit, and also make it possible to move code from packages like DistributionsAD to Distributions, etc.
Note we already have DualNumbers.jl. I believe the plan is to move ForwardDiff.Dual to there. (Unless there's a need for un-tagged dual numbers...)
x-ref https://github.com/JuliaDiff/DualNumbers.jl/issues/45
JuliaDiff/DualNumbers.jl#45 would be nice ...
I think #45 is definitely the way to go, but ForwardDiff.Dual definitely needs some work to make it easy to use (beginning with pretty printing).
I was looking at https://github.com/JuliaDiff/DualNumbers.jl/issues/45, https://github.com/JuliaDiff/DualNumbers.jl/pull/49 and the source code of DualNumbers.jl. If someone were hypothetically to embark on such a migration (let's call the result FD DualNumbers), I have some questions (and observations):
- Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus? If FD DualNumbers only supports Base and exposes an overload API, an alternative path would be to have those packages define their own derivatives.
- On the other hand, SpecialFunctions already loads ChainRulesCore. If there is a possibility of any integration between the two packages, that integration could be done at the FD DualNumbers level.
- Regarding pretty printing, there was an attempt to improve it in https://github.com/JuliaDiff/ForwardDiff.jl/pull/193, but it was not merged for some valid reasons.
- I suppose that https://github.com/JuliaDiff/DualNumbers.jl/pull/49 is stale, but the migration guide provided by @jrevels is really useful:
Before you go down this route, be warned that it will probably involve a good bit of work and digging into the implementation details of both packages. I had planned on doing this work myself after v0.6 released to avoid having to continuously update with breakage/depwarn fixes.
We need the change-over to not merely swap out the old implementation for ForwardDiff's, but to ensure that the feature sets of the two implementations are appropriately merged. We'll wish to drop some of the old behaviors, while other behaviors we'll wish to preserve, probably requiring new definitions. For example, there are some primitives defined on DualNumbers.Dual that are not yet defined on ForwardDiff.Dual. There might be more subtle behavioral changes as well.
Things that have to be done (besides just porting over the code):
- [ ] Implement a deprecation layer
- [ ] Implement whatever new functionality we need to appropriately merge the behavior of the two implementations
- [ ] Write new tests for any new definitions
- [ ] Write documentation describing the new interface
I'd also like my name added to the LICENSE (and I believe @mlubin is also within his rights to request this, but I'll let him speak for himself). I believe doing this requires Theo's permission?
Are there any additional things to be done apart from the list above?
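As a rough idea of what the "deprecation layer" item could look like, here is a minimal sketch. It assumes the old DualNumbers-style accessors are realpart and dualpart and the new ones follow the ForwardDiff-style names value and partials; MergedDual is a hypothetical stand-in for the merged type, not code from either package.

```julia
# Hypothetical merged dual type (stand-in, not the real implementation)
struct MergedDual{T<:Real} <: Real
    val::T
    der::T
end

# New accessors, named in the ForwardDiff style
value(d::MergedDual) = d.val
partials(d::MergedDual) = d.der

# Old accessor names kept as thin forwarding methods. Base.depwarn only
# prints when deprecation warnings are enabled (e.g. `julia --depwarn=yes`).
function realpart(d::MergedDual)
    Base.depwarn("`realpart(d)` is deprecated, use `value(d)` instead.", :realpart)
    return value(d)
end

function dualpart(d::MergedDual)
    Base.depwarn("`dualpart(d)` is deprecated, use `partials(d)` instead.", :dualpart)
    return partials(d)
end

# Old-style code keeps working during the transition
d = MergedDual(3.0, 2.0)
old_style = (realpart(d), dualpart(d))
new_style = (value(d), partials(d))
```

Behaviors that are meant to be dropped rather than renamed would need a different treatment; this only covers the simple renaming case.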
Most of the load time of ForwardDiff is actually due to StaticArrays, which is, I think, only used for the Hessian, Jacobian, etc. functionality, so a package focused on dual numbers should load very quickly.
> Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus
I think if it's lightweight enough there would be a chance to convince SpecialFunctions, NaNMath, etc. to support it, instead of the other way round.
> On the other hand, SpecialFunctions already loads ChainRulesCore
Supporting ChainRulesCore would open so many doors. StatsFuns, for example, defines a lot of ChainRulesCore.@scalar_rules, but they are pretty much unusable at the moment because ForwardDiff doesn't utilize them.
One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
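The following toy sketch illustrates the mechanism being suggested: if the rule-defining macro can see the dual type, it can emit a dual-number method next to the rule. Everything here is an assumption for illustration. @toy_scalar_rule is a made-up macro and not ChainRulesCore's @scalar_rule, RuleDual is a stand-in for the minimal struct, and logistic is just an example function.

```julia
# Stand-in for the minimal dual struct that would live next to the rule macro
struct RuleDual{T<:Real} <: Real
    val::T
    der::T
end

# Hypothetical rule macro: given a call `f(x)` and an expression for df/dx
# written in terms of `x`, define the dual-number method of `f`.
macro toy_scalar_rule(call, deriv)
    f, x = call.args[1], call.args[2]
    return esc(quote
        function $f(d::RuleDual)
            $x = d.val                                    # bind the primal value
            return RuleDual($f($x), ($deriv) * d.der)     # chain rule
        end
    end)
end

# A package author writes the primal function and one rule ...
logistic(x::Real) = 1 / (1 + exp(-x))
@toy_scalar_rule logistic(x) logistic(x) * (1 - logistic(x))

# ... and the dual-number method exists without any further code.
out = logistic(RuleDual(0.0, 1.0))
```

Here out.val is the function value at 0 and out.der is the derivative there. The real @scalar_rule has a richer syntax and also defines reverse-mode rules, which this sketch ignores.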
Looking at the direct dependents of DualNumbers.jl listed on GitHub, not all of them still depend on it in their latest version:
- AIBECS: Does not depend on DualNumbers.jl in its latest version
- ApproxFun: https://github.com/JuliaApproximation/ApproxFun.jl/blob/master/src/Extras/dualnumbers.jl . See also https://github.com/JuliaApproximation/ApproxFun.jl/issues/764
- ApproxFunBase: https://github.com/JuliaApproximation/ApproxFunBase.jl/blob/ffad968dd10ac28477cec04d28036ffdbd879752/src/LinearAlgebra/helper.jl#L83-L84
- ApproxFunFourier: Does not depend on DualNumbers.jl in their latest version
- ApproxFunOrthogonalPolynomials: Does not depend on DualNumbers.jl in their latest version
- ApproxFunSingularities: Does not depend on DualNumbers.jl in their latest version
- DualMatrixTools: https://github.com/briochemc/DualMatrixTools.jl/blob/master/src/DualMatrixTools.jl
- F1Method: Does not depend on DualNumbers.jl in their latest version
- HypergeometricFunctions: it depends, in multiple files https://github.com/JuliaMath/HypergeometricFunctions.jl
- Interpolations: Does not depend on DualNumbers.jl in their latest version
- InvariantMeasures: https://github.com/orkolorko/InvariantMeasures.jl/blob/ebe81b98efb229eca19c189eff0a203562ed5a47/src/NewChebyshev.jl#L221-L225 (is this legal?)
- Nabla: Does not depend on DualNumbers.jl in their latest version
- Poltergeist: it depends https://github.com/wormell/Poltergeist.jl (last update was 2 years ago)
- Quaternions: https://github.com/JuliaGeometry/Quaternions.jl/blob/master/src/DualQuaternion.jl
- RiemannHilbert: https://github.com/JuliaHolomorphic/RiemannHilbert.jl/blob/master/src/LogNumber.jl
- SALTBase: https://github.com/wsshin/SALTBase.jl/commit/42faec4f2949f1bf5cbb7c133c0b9aee8ba7e2f4 (it used FD before and switched to DualNumbers)
- SingularIntegralEquations: https://github.com/JuliaApproximation/SingularIntegralEquations.jl/blob/d4f5caeb37a76e58b131575897955ce1c29d5f36/src/Extras/normalderivative.jl
> One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
Coming from you, that's almost an endorsement @mcabbott :-)
Maybe that's not that crazy at all? We don't want ChainRulesCore to become noticeably heavier, of course, now that it's making real inroads throughout the ecosystem - but maybe the cost wouldn't be high? We're currently at (Julia v1.8.0-beta3)
julia> @time_imports using ChainRulesCore
3.1 ms ┌ Compat
58.1 ms ChainRulesCore
If it's just 5 ms more or so, maybe that would be Ok? DualNumbers are quite fundamental after all - or at least will be once there's only one version of them around.
> One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore [...] If it's just 5 ms more or so, maybe that would be Ok?
The package load times do suggest a certain graph of package dependencies (run in sequence in a single session):
julia> @time_imports using ChainRulesCore
3.2 ms ┌ Compat
63.2 ms ChainRulesCore
julia> @time_imports using Calculus
3.3 ms Calculus
julia> @time_imports using NaNMath
1.6 ms NaNMath
julia> @time_imports using SpecialFunctions
0.9 ms ┌ ChangesOfVariables
0.3 ms ┌ OpenLibm_jll
3.0 ms ┌ DocStringExtensions
4.1 ms ┌ IrrationalConstants
0.6 ms ┌ CompilerSupportLibraries_jll
1.4 ms ┌ LogExpFunctions
17.4 ms ┌ Preferences
18.0 ms ┌ JLLWrappers
21.4 ms ┌ OpenSpecFun_jll
131.1 ms SpecialFunctions
julia> @time_imports using DualNumbers
13.7 ms DualNumbers
Especially SpecialFunctions should clearly depend on a dual-numbers package and not the other way round. :-) And ChainRulesCore depending on dual-numbers would seem quite natural as well. And the potential benefits would be huge - we would quickly get a lot more dual-numbers/ForwardDiff-support throughout the ecosystem (especially in the statistics sector - DistributionsAD could just go away completely - but also in many other domains).
> DistributionsAD could just go away completely
ForwardDiff is not the main blocker, it's Tracker and ReverseDiff. There are only very few definitions for dual numbers remaining: https://github.com/TuringLang/DistributionsAD.jl/blob/master/src/forwarddiff.jl
> > DistributionsAD could just go away completely
> ForwardDiff is not the main blocker, it's Tracker and ReverseDiff.
Ah, sorry, you're right of course. (Full) ChainRulesCore-support in Tracker and ReverseDiff would be so nice ...
Speaking of the statistics domain there's StatsFuns, though, with several @scalar_rule's that could make the respective functions ForwardDiff-compatible.