ForwardDiff.jl
Is this code a supported use? (single-pass value and derivative)
Consider these two methods, which compute the value and first derivative of a scalar function in a single pass:
import ForwardDiff
import DiffResults

# Method 1: via DiffResults; the caller passes the return type Y of f up front.
@inline function value_and_derivative(f, Y::Type, x::Real)
    diffresult = ForwardDiff.derivative!(DiffResults.DiffResult(zero(Y), zero(Y)), f, x)
    return DiffResults.value(diffresult), DiffResults.derivative(diffresult)
end

# Method 2: evaluate f on a dual number directly and unpack the result.
@inline function value_and_derivative(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return ForwardDiff.value(T, ydual), ForwardDiff.extract_derivative(T, ydual)
end
- The first method seems to be the officially recommended way to do this. However, (1) it introduces the DiffResults dependency just for this, and (2) it needlessly requires the caller to specify f's return type ahead of the call. Neither is a dealbreaker, but IMO they add friction for something that looks like it shouldn't have it (intuition says that the value comes for free when computing a derivative with ForwardDiff).
- The second method uses the Tag and Dual types as well as the extract_derivative function, which are not listed in the "Differentiation API" in the docs, so I'm not sure if they're considered part of the stable public API.
Both methods run equally fast (and significantly faster than a naive two-pass implementation), so my question is: does the second method constitute a supported use of ForwardDiff's public API?
If such a use isn't supported (and maybe even if it is, since both method implementations appear too convoluted for a pretty common use case IMO), I'd like to suggest adding a value_and_derivative function with the second method to either this or one of the other packages in JuliaDiff (I'm willing to write a PR).
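For comparison, the naive two-pass implementation mentioned above would look roughly like the following sketch. The name `value_and_derivative_naive` and the test function `f` are made up for illustration; only `ForwardDiff.derivative` is documented API.

```julia
import ForwardDiff

# Hypothetical baseline: f is evaluated twice, once on a plain number
# and once more inside ForwardDiff.derivative.
value_and_derivative_naive(f, x::Real) = (f(x), ForwardDiff.derivative(f, x))

f(x) = sin(x) * x^2
y, dy = value_and_derivative_naive(f, 1.5)
```

Both single-pass methods avoid the separate `f(x)` call, which is where their speed advantage comes from.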
Related: #401, #391
EDITS: y-> ydual, ydual.value -> ForwardDiff.value(T, ydual), add another related issue
I can't comment on official API and design questions here, at least not in an official way. However, a quick note on "(1) it introduces the DiffResults dependency just for this":
DiffResults is a dependency of ForwardDiff, so it's a dependency anyway, regardless of whether you load it or not. You might even be able to load ForwardDiff.DiffResults or ForwardDiff.DiffResult directly.
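A minimal sketch of that idea, assuming ForwardDiff keeps binding the DiffResults module in its own namespace. That is an implementation detail rather than documented API, so it could stop working in a future release.

```julia
import ForwardDiff

# Assumption: relies on ForwardDiff loading DiffResults internally,
# which makes the module reachable as ForwardDiff.DiffResults.
const DiffResults = ForwardDiff.DiffResults

result = ForwardDiff.derivative!(DiffResults.DiffResult(0.0, 0.0), sin, 1.0)
val, der = DiffResults.value(result), DiffResults.derivative(result)
```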
Thanks for the comment. That means that no extra packages are installed just for this simple use case (which is a good thing!). Unfortunately, it doesn't mean that one can avoid having to explicitly add DiffResults as a dependency (unless this is part of the public API). Honestly, I would care a lot less about this if I were to find a better solution to my other point (putting it another way, if DiffResults were less constraining for this simple case; or if I didn't have to use it at all).
Regarding that other point (i.e., the fact that, when using DiffResults, the return type must be known before the call), I'm thinking that an alternative to my initial suggestion could be to add a ForwardDiff.derivative[!] method that also returns a DiffResult but does not require a DiffResult as an input, for use in this scenario.
EDIT: By that I mean adding a new method like this:
import ForwardDiff
import DiffResults

@inline function ForwardDiff.derivative!(::Nothing, f, x::Real) # Or just value_and_derivative(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return DiffResults.DiffResult(ForwardDiff.value(T, ydual), ForwardDiff.extract_derivative(T, ydual))
end
The ::Nothing parameter used for dispatch could be replaced with some other type (even some kind of "empty" DiffResults object) for the same purpose.
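To illustrate how a caller would consume such a result, here is a sketch under a hypothetical name, `derivative_result`, chosen so that it does not extend `ForwardDiff.derivative!` from outside the package. Like the proposal, it relies on `Tag`, `Dual`, and `extract_derivative`, whose API status is the open question here.

```julia
import ForwardDiff
import DiffResults

# Hypothetical name; same body as the proposed method, but defined as a
# standalone function instead of a new ForwardDiff.derivative! method.
function derivative_result(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return DiffResults.DiffResult(ForwardDiff.value(T, ydual),
                                  ForwardDiff.extract_derivative(T, ydual))
end

# No return type has to be supplied ahead of the call.
r = derivative_result(x -> x^3, 2.0)
y, dy = DiffResults.value(r), DiffResults.derivative(r)
```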
~~Note: a quick (and definitely non-scientific) benchmark I did showed that wrapping the value and derivative in a DiffResult object carries an overhead, so I'd still prefer to use a function that just returns a plain tuple.~~ (After running the same benchmark again many times, I no longer see a difference).
I've found a couple of implementations of the second method in SciML packages, so there's definitely demand for such a method, as well as some precedent of treating Dual (and related types/functions) as part of ForwardDiff's API.
A small caveat: SimpleNonlinearSolve was extracted, and to a large extent copied, from NonlinearSolve just a few days ago, so the links are basically a single example of the second method.
I am doing something similar in a private package. I don’t think it’s official API, so I just have a few tests that will be indicative when things break.
I guess, whether you are fine with something like this depends on your risk persona and application area.
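A minimal sketch of what such guard tests could look like, assuming the dual-number method from the original post. The test function and sample points are illustrative; the idea is to compare the internals-based result against the documented `ForwardDiff.derivative`, so that an upstream change to the internals shows up as a test failure or error.

```julia
import ForwardDiff
using Test

# Relies on ForwardDiff internals (Tag, Dual, extract_derivative),
# which are not confirmed to be stable public API.
function value_and_derivative(f, x::Real)
    T = typeof(ForwardDiff.Tag(f, typeof(x)))
    ydual = f(ForwardDiff.Dual{T}(x, one(x)))
    return ForwardDiff.value(T, ydual), ForwardDiff.extract_derivative(T, ydual)
end

@testset "value_and_derivative agrees with documented API" begin
    g(x) = exp(2x) + x^3
    for x in (-1.0, 0.0, 0.5, 2.0)
        y, dy = value_and_derivative(g, x)
        @test y ≈ g(x)
        @test dy ≈ ForwardDiff.derivative(g, x)
    end
end
```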
@thomvet Well, I ended up doing the same. I still hope this type of usage can be included in the public API (or clarified as such if it's already meant to be API) so as to be able to avoid future breakage.