ReverseDiff.jl

Support undifferentiated parameters for pre-recorded API

Open jrevels opened this issue 9 years ago • 2 comments

Right now, if I have a function f(a, b, c) and I only want to create a function which returns the gradient w.r.t. a and b, I have two options:

  • ∇f(a, b, c) = ReverseDiff.gradient((x, y) -> f(x, y, c), (a, b))
  • ∇f! = ReverseDiff.compile_gradient(f, (a, b, c)), and just ignore the c gradient that will pop out

The former has to re-record the function for every call, while the latter wastes some computation differentiating w.r.t. c.
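For concreteness, here is a minimal sketch of those two workarounds using the prerecorded-tape entry points (GradientTape, compile, and gradient!); f and the array inputs below are just toy placeholders:

```julia
using ReverseDiff

# Toy placeholders: c is the argument we do not want a gradient for.
f(a, b, c) = sum(a .* b .+ c)
a, b, c = rand(3), rand(3), rand(3)

# Option 1: close over c and re-record the tape on every call.
∇f(a, b, c) = ReverseDiff.gradient((x, y) -> f(x, y, c), (a, b))

# Option 2: record/compile one tape over all three arguments and
# simply discard the c gradient that pops out.
tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, (a, b, c)))
results = (similar(a), similar(b), similar(c))
ReverseDiff.gradient!(results, tape, (a, b, c))
∇a, ∇b, _ = results  # ∂f/∂c is computed but ignored
```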

We should support something akin to Tensorflow's placeholders for the pre-recorded API, allowing you to drop in updatable parameters that aren't differentiated against. This can be accomplished by recording the tape as normal, and then "turning off" differentiation on the selected parameters (the idiom for that currently is to set the tape to NULL_TAPE, but I'm going to play around with it). Some refactoring should probably be done to get the most out of this change performance-wise (e.g., allow the instantiation of a TrackedArray with deriv == nothing).

As for the API, I can think of two different paths we could take:

  • Select which arguments are to be differentiated against using a wrt function, e.g. ReverseDiff.compile_gradient(f, (wrt(a), wrt(b), c))
  • Select which arguments are not to be differentiated against using a param function, e.g. ReverseDiff.compile_gradient(f, (a, b, param(c)))
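Neither wrt nor param exists yet, so purely as an illustration of the second call-site shape, a hypothetical user-level wrapper could emulate it today. Unlike the proposed tape-level support, this still pays the cost of differentiating w.r.t. c; it only hides the unwanted gradient from the caller:

```julia
using ReverseDiff

# Hypothetical sketch only: `param` is not part of ReverseDiff's API.
struct Param{T}
    value::T
end
param(x) = Param(x)

unwrap(x) = x
unwrap(p::Param) = p.value

function compile_gradient_ignoring_params(f, args...)
    values = map(unwrap, args)
    keep = map(a -> !(a isa Param), args)
    tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, values))
    results = map(similar, values)
    return function (inputs...)
        # Gradients are still computed for every argument; the ones
        # marked via `param` are simply dropped from the return value.
        ReverseDiff.gradient!(results, tape, inputs)
        return Tuple(r for (r, k) in zip(results, keep) if k)
    end
end

# Intended call-site shape from the second proposal:
f(a, b, c) = sum(a .* b .+ c)
a, b, c = rand(3), rand(3), rand(3)
∇f! = compile_gradient_ignoring_params(f, a, b, param(c))
∇a, ∇b = ∇f!(a, b, c)
```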

jrevels avatar Dec 12 '16 03:12 jrevels

I need to compute the gradient of a function in a tight loop where one parameter is constant, so this feature would be exactly what I need.

IMHO, the second option (using param) would feel more natural to me.

Alexander-Barth avatar Apr 23 '18 12:04 Alexander-Barth

Is this still in the cards?

dpo avatar Feb 09 '22 04:02 dpo