ForwardDiff.jl icon indicating copy to clipboard operation
ForwardDiff.jl copied to clipboard

wip: implement macro for users to define their own derivatives

Open KristofferC opened this issue 8 years ago • 8 comments

This is a start towards an official API for fixing https://github.com/JuliaDiff/ForwardDiff.jl/issues/89. It provides macros for derivatives, gradients and jacobians such that user defined derivatives are easily injected:

julia> f(x) = x^2 + x;

julia> Df(x) = (println("I got called"); 2x + 1);

julia> ForwardDiff.@implement_derivative f Df
f (generic function with 2 methods)

julia> ForwardDiff.derivative(f, 2.0)
I got called
5.0

For gradients and jacobians there are two "config" objects that can be used, which cache intermediate arrays needed.

I don't think we can support user defined Hessians at the moment because of the composition of gradient and jacobian that is currently being used.

TODOs:

  • [ ] The documentation is a bit so-so. Just wanted to get something up quickly.
  • [x] Something weird happens when I use this in a function:
julia> function test()
           h(x) = 3x
           dh(x) = 3
           ForwardDiff.@implement_derivative h dh
           ForwardDiff.derivative(h, 2.0)
       end
test (generic function with 1 method)

julia> test()
------ UndefVarError ------------------- Stacktrace (most recent call last)

 [1] — test() at REPL[2]:2

UndefVarError: dh not defined

I opened a thread on the forum: https://discourse.julialang.org/t/question-about-scoping-inside-a-function/709

Conclusion Lowering bug in julia 0.5. Need to define the derivative before the function.

  • [x] Performance checks, if function + derivative is non allocating, then the call to ForwardDiff should be non allocating assuming that all the cache functionality is used.
  • [x] Figure out why this is needed: https://github.com/JuliaDiff/ForwardDiff.jl/pull/165/files#diff-a51666f82528f840f37fcc362fb74659R16 [Fixed by https://github.com/JuliaDiff/ForwardDiff.jl/pull/162]

KristofferC avatar Dec 03 '16 14:12 KristofferC

A bit blocked here due to https://discourse.julialang.org/t/question-about-scoping-inside-a-function/709/12

KristofferC avatar Dec 05 '16 07:12 KristofferC

I am thinking of using something like this in https://github.com/KristofferC/ContMechTensors.jl. I could basically define a bunch of known derivatives (like d/dA det(A) = det(A)inv(A)') and perhaps that would get better performance since I could reuse the det computations in the value and the derivative (a bit like the optimizations that are already being done for scalars but on a higher level). I could either form the matrix of partials on the stack or write a custom matrix multiply function that skips the value part. Will have to benchmark to see what is best.

KristofferC avatar Dec 05 '16 14:12 KristofferC

Benchmarks at https://gist.github.com/KristofferC/930dc1de9d64a6930d3f3c81da72cae4.

The user defined functions are non allocating but consistently slower. The reason for this (I believe) is that the function turns into Core.Box when the macro is used. These Boxes are pretty annoying, it feels I get them everytime I try to exploit a closure.

julia> function bench_derivative()
          dh(x) = 3 * one(x)
          h(x) = 3x
          h2(x) = 3x
          ForwardDiff.@implement_derivative h dh
          d_user = @benchmark ForwardDiff.derivative($h, 2.0)
          d_normal = @benchmark ForwardDiff.derivative($h2, 2.0)
          return user, normal
       end

julia> @code_warntype bench_derivative()
Variables:
  #self#::#bench_derivative
  dh::#dh#74
  h2::#h2#76
  h::Core.Box
  d_user::Any
  348::Any
  d_normal::Any
  349::Any

Edit: Boxes are not the only thing, they are a bit slower in global scope as well. However, the extraction of the jacobian and calling to blas might explain that. The scalar derivatives are the same speed in global scope.

KristofferC avatar Dec 05 '16 15:12 KristofferC

@KristofferC It seems the only item on your checklist is the docs, but is the main blocker here still the closure performance bug? If so, that can usually be avoided by explicitly using a callable type instead (I know it's annoying, though).

I'm curious because I'm excited for this feature, let me know if there's any way for me to help 😃

jrevels avatar Feb 11 '17 15:02 jrevels

The most helpful thing would be to read the docs I have written, comment on that and try the macro out for yourself to see how you think the workflow is.

I actually don't think the closure thing is a big problem. The reason these manual implementations are slower sometimes is that the matrix multiplication (in case of jacobians) that is being done to propagate the partials could very well be more expensive than to just evaluate the function with dual numbers.

KristofferC avatar Feb 11 '17 16:02 KristofferC

Rebased

KristofferC avatar May 04 '17 09:05 KristofferC

Is this progressing? My version of ForwardDiff does not implement these macros and Icoudl really use it :-)

BCRARL avatar Apr 25 '18 19:04 BCRARL

bump 😄

gszep avatar Oct 13 '21 09:10 gszep