ForwardDiff.jl

Create lightweight package AbstractDualNumbers or ForwardDiffBase or similar?

Open oschulz opened this issue 4 years ago • 14 comments

My apologies if this has been suggested before:

While ForwardDiff is not the heaviest of packages in the ecosystem, it's also not exactly lightweight (it takes 1.6 seconds to load on my system). A lightweight package AbstractDualNumbers.jl or ForwardDiffBase.jl (or similar) that just defines something like abstract type AbstractDualNumber{Tag} <: Real end, plus function stubs like function AbstractDualNumbers.value end and function AbstractDualNumbers.partials end, could allow packages to define custom push-forwards without depending on ForwardDiff itself.
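A minimal sketch of what such a split might look like, under stated assumptions: the module names, the ToyDual type and the dual_like hook are all hypothetical and exist in no current package. The sketch adds dual_like because a push-forward has to construct its result, which value and partials alone do not allow.

```julia
# Hypothetical interface package; the name and the `dual_like` hook are assumptions.
module AbstractDualNumbers
export AbstractDualNumber, value, partials, dual_like

# Supertype that a concrete dual-number implementation would subtype.
abstract type AbstractDualNumber{Tag} <: Real end

# Function stubs without methods; the package owning the concrete type adds them.
function value end
function partials end
function dual_like end   # hypothetical: build a dual of the same kind as the first argument

end # module

# Toy stand-in for the AD package that owns the concrete dual type.
module ToyAD
using ..AbstractDualNumbers

struct ToyDual{Tag,T<:Real} <: AbstractDualNumber{Tag}
    value::T
    partial::T
end
ToyDual{Tag}(v::T, p::T) where {Tag,T<:Real} = ToyDual{Tag,T}(v, p)

AbstractDualNumbers.value(d::ToyDual) = d.value
AbstractDualNumbers.partials(d::ToyDual) = (d.partial,)
AbstractDualNumbers.dual_like(::ToyDual{Tag}, v, p) where {Tag} =
    ToyDual{Tag}(promote(v, first(p))...)

end # module

# A downstream package depends only on the interface, not on ToyAD.
module Downstream
using ..AbstractDualNumbers

cube(x::Real) = x^3

# Custom push-forward: d(x^3) = 3x^2 dx, applied to each partial.
function cube(x::AbstractDualNumber)
    v = value(x)
    return dual_like(x, v^3, map(p -> 3 * v^2 * p, partials(x)))
end

end # module

x = ToyAD.ToyDual{:demo}(2.0, 1.0)
y = Downstream.cube(x)
```

Here Downstream never mentions ToyAD, which is the point of the proposal: the rule is written against the abstract type and the stubs only.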

I know that there are exciting efforts underway in the Julia-AD-ecosystem for new ADs (e.g. Diffractor), but ForwardDiff is certainly not going away any time soon. A really lightweight way to define push-forwards could reduce the frequency of @require ForwardDiff in the ecosystem quite a bit, and also make it possible to move code from packages like DistributionsAD to Distributions, etc.

oschulz avatar May 01 '21 10:05 oschulz

Note we already have DualNumbers.jl. I believe the plan is to move ForwardDiff.Dual to there. (Unless there's a need for un-tagged dual numbers...)

dlfivefifty avatar Aug 31 '21 19:08 dlfivefifty

x-ref https://github.com/JuliaDiff/DualNumbers.jl/issues/45

hyrodium avatar Apr 16 '22 04:04 hyrodium

JuliaDiff/DualNumbers.jl#45 would be nice ...

oschulz avatar Apr 16 '22 18:04 oschulz

I think #45 is definitely the way to go, but ForwardDiff.Dual definitely needs some work to make it easy to use (beginning with pretty printing).

dlfivefifty avatar Apr 16 '22 19:04 dlfivefifty

I was looking at https://github.com/JuliaDiff/DualNumbers.jl/issues/45, https://github.com/JuliaDiff/DualNumbers.jl/pull/49 and the source code of DualNumbers.jl. For hypothetically embarking on such a migration (let's call the result FD DualNumbers), I have some questions (and observations):

  • Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus? If FD DualNumbers only supports Base and exposes an overload API, an alternative path would be for those packages to define their own derivatives.
  • On the other hand, SpecialFunctions already loads ChainRulesCore. If there is a possibility of any integration between the two packages, that integration could be done at the FD DualNumbers level.
  • On the pretty printing: there was an attempt to improve it in https://github.com/JuliaDiff/ForwardDiff.jl/pull/193, but it was not merged for some valid reasons.
  • I suppose that https://github.com/JuliaDiff/DualNumbers.jl/pull/49 is stale, but the migration guide provided by @jrevels is really useful:

Before you go down this route, be warned that it will probably involve a good bit of work and digging into the implementation details of both packages. I had planned on doing this work myself after v0.6 released to avoid having to continuously update with breakage/depwarn fixes.

We need the change-over to not merely swap out the old implementation for ForwardDiff's, but to ensure that the feature sets of the two implementations are appropriately merged. We'll wish to drop some of the old behaviors, while other behaviors we'll wish to preserve, probably requiring new definitions. For example, there are some primitives defined on DualNumbers.Dual that are not yet defined on ForwardDiff.Dual. There might be more subtle behavioral changes as well.

Things that have to be done (besides just porting over the code):

  • [ ] Implement a deprecation layer
  • [ ] Implement whatever new functionality we need to appropriately merge the behavior of the two implementations
  • [ ] Write new tests for any new definitions
  • [ ] Write documentation describing the new interface

I'd also like my name added to the LICENSE (and I believe @mlubin is also within his rights to request this, but I'll let him speak for himself). I believe doing this requires Theo's permission?

Are there any additional things to be done apart from the list above?

longemen3000 avatar Apr 20 '22 05:04 longemen3000

Most of the load time of ForwardDiff is actually due to StaticArrays, which is, I think, only used for the Hessian, Jacobian, etc. functionality, so a package focused on dual numbers should load very quickly.

oschulz avatar May 07 '22 02:05 oschulz

Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus?

I think if it's lightweight enough there would be a chance to convince SpecialFunctions, NaNMath, etc. to support it, instead of the other way round.

oschulz avatar May 07 '22 02:05 oschulz

On the other hand, SpecialFunctions already loads ChainRulesCore

Supporting ChainRulesCore would open so many doors. StatsFuns, for example, defines a lot of ChainRulesCore.@scalar_rule rules, but they are pretty much unusable at the moment because ForwardDiff doesn't utilize them.

oschulz avatar May 07 '22 02:05 oschulz

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
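To illustrate how forward rules could drive dual-number methods, here is a rough sketch that does not use ChainRulesCore at all. MiniDual and mini_frule are hypothetical stand-ins: mini_frule uses a simplified signature, whereas the real ChainRulesCore.frule is called as frule((Δself, Δx), f, x) and also returns a (primal, tangent) pair.

```julia
# Minimal untagged dual number, for illustration only (not ForwardDiff.Dual).
struct MiniDual{T<:Real} <: Real
    value::T
    partial::T
end

# Hypothetical forward rules: each returns (primal, tangent).
mini_frule(::typeof(sin), x, dx) = (sin(x), cos(x) * dx)
function mini_frule(::typeof(exp), x, dx)
    y = exp(x)
    return (y, y * dx)
end

# Generate the dual-number methods from the rules, one per covered function.
for f in (:sin, :exp)
    @eval Base.$f(d::MiniDual) = MiniDual(mini_frule($f, d.value, d.partial)...)
end

s = sin(MiniDual(0.0, 1.0))   # derivative of sin at 0 is cos(0)
e = exp(MiniDual(0.0, 2.0))   # tangent is scaled by the seed 2.0
```

If the struct lived next to the rule macro, the loop above is roughly the step a rule definition could perform automatically; whether that fits ChainRulesCore's design is a separate question.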

mcabbott avatar May 07 '22 02:05 mcabbott

Looking at the direct dependents of DualNumbers.jl on GitHub, not all of them still depend on it in their latest version:

  • AIBECS: Does not depend on DualNumbers.jl in their latest version
  • ApproxFun: https://github.com/JuliaApproximation/ApproxFun.jl/blob/master/src/Extras/dualnumbers.jl (see also https://github.com/JuliaApproximation/ApproxFun.jl/issues/764)
  • ApproxFunBase: https://github.com/JuliaApproximation/ApproxFunBase.jl/blob/ffad968dd10ac28477cec04d28036ffdbd879752/src/LinearAlgebra/helper.jl#L83-L84
  • ApproxFunFourier: Does not depend on DualNumbers.jl in their latest version
  • ApproxFunOrthogonalPolynomials: Does not depend on DualNumbers.jl in their latest version
  • ApproxFunSingularities: Does not depend on DualNumbers.jl in their latest version
  • DualMatrixTools: https://github.com/briochemc/DualMatrixTools.jl/blob/master/src/DualMatrixTools.jl
  • F1Method: Does not depend on DualNumbers.jl in their latest version
  • HypergeometricFunctions: it depends, in multiple files https://github.com/JuliaMath/HypergeometricFunctions.jl
  • Interpolations: Does not depend on DualNumbers.jl in their latest version
  • InvariantMeasures: https://github.com/orkolorko/InvariantMeasures.jl/blob/ebe81b98efb229eca19c189eff0a203562ed5a47/src/NewChebyshev.jl#L221-L225 (is this legal?)
  • Nabla: Does not depend on DualNumbers.jl in their latest version
  • Poltergeist: it depends https://github.com/wormell/Poltergeist.jl (last update was 2 years ago)
  • Quaternions: https://github.com/JuliaGeometry/Quaternions.jl/blob/master/src/DualQuaternion.jl
  • RiemannHilbert: https://github.com/JuliaHolomorphic/RiemannHilbert.jl/blob/master/src/LogNumber.jl
  • SALTBase: https://github.com/wsshin/SALTBase.jl/commit/42faec4f2949f1bf5cbb7c133c0b9aee8ba7e2f4 (it used FD before and switched to DualNumbers)
  • SingularIntegralEquations: https://github.com/JuliaApproximation/SingularIntegralEquations.jl/blob/d4f5caeb37a76e58b131575897955ce1c29d5f36/src/Extras/normalderivative.jl

longemen3000 avatar May 07 '22 03:05 longemen3000

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.

Coming from you, that's almost an endorsement @mcabbott :-)

Maybe that's not that crazy at all? We don't want ChainRulesCore to become noticeably heavier, of course, now that it's making real inroads throughout the ecosystem - but maybe the cost wouldn't be high? We're currently at (Julia v1.8.0-beta3)

julia> @time_imports using ChainRulesCore
      3.1 ms  ┌ Compat
     58.1 ms  ChainRulesCore

If it's just 5 ms more or so, maybe that would be Ok? DualNumbers are quite fundamental after all - or at least will be once there's only one version of them around.

oschulz avatar May 07 '22 08:05 oschulz

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore [...] If it's just 5 ms more or so, maybe that would be Ok?

The package load times do suggest a certain graph of package dependencies (run in sequence in a single session):

julia> @time_imports using ChainRulesCore
      3.2 ms  ┌ Compat
     63.2 ms  ChainRulesCore

julia> @time_imports using Calculus
      3.3 ms  Calculus

julia> @time_imports using NaNMath
      1.6 ms  NaNMath

julia> @time_imports using SpecialFunctions
      0.9 ms  ┌ ChangesOfVariables
      0.3 ms  ┌ OpenLibm_jll
      3.0 ms  ┌ DocStringExtensions
      4.1 ms  ┌ IrrationalConstants
      0.6 ms  ┌ CompilerSupportLibraries_jll
      1.4 ms  ┌ LogExpFunctions
     17.4 ms      ┌ Preferences
     18.0 ms    ┌ JLLWrappers
     21.4 ms  ┌ OpenSpecFun_jll
    131.1 ms  SpecialFunctions

julia> @time_imports using DualNumbers
     13.7 ms  DualNumbers

Especially SpecialFunctions should clearly depend on a dual-numbers package and not the other way round. :-) And ChainRulesCore depending on dual-numbers would seem quite natural as well. And the potential benefits would be huge - we would quickly get a lot more dual-numbers/ForwardDiff-support throughout the ecosystem (especially in the statistics sector - DistributionsAD could just go away completely - but also in many other domains).

oschulz avatar May 07 '22 08:05 oschulz

DistributionsAD could just go away completely

ForwardDiff is not the main blocker, it's Tracker and ReverseDiff. There are only very few definitions for dual numbers remaining: https://github.com/TuringLang/DistributionsAD.jl/blob/master/src/forwarddiff.jl

devmotion avatar May 07 '22 10:05 devmotion

DistributionsAD could just go away completely

ForwardDiff is not the main blocker, it's Tracker and ReverseDiff.

Ah, sorry, you're right of course. (Full) ChainRulesCore-support in Tracker and ReverseDiff would be so nice ...

Speaking of the statistics domain, there's StatsFuns, though, with several @scalar_rule definitions that could make the respective functions ForwardDiff-compatible.

oschulz avatar May 07 '22 12:05 oschulz