ForwardDiff.jl
confused by this root finding example
I have some code that calls a root-finding step via the Roots package, and I'd like to be able to differentiate through it. Whether ForwardDiff "works" in this setting appears to depend on some subtlety that I don't quite understand.
For an MWE, consider finding the (unique) root of the cubic function g0(x, k) = (x-k)^3, which has a root at x = k. Let g1(k) = find_zero(x -> g0(x, k), [-10.0, +10.0]) and similarly let g2(k) = find_zero(x -> g0(x, k), [-1.0 * k, +1.0 * k]). Note that for values of k between -10 and +10, g1(k) == g2(k). In this range, the only difference between g1 and g2 is that the root-finding step in g2 is "aware" of the value of k to look for, via the bounds I have provided, whereas g1 is ignorant of it. For values of k in this range, the derivatives of g1 and g2 should be identical and equal to 1.0. However, I am seeing that ForwardDiff thinks the derivative of g1 is zero, while it finds the derivative of g2 to be 1.0:
using Roots, ForwardDiff, FiniteDiff
g0(x,k) = (x-k)^3
g1(k) = find_zero(x -> g0(x,k), [-10.0, +10.0])
g2(k) = find_zero(x -> g0(x,k), [-1.0 * k, +1.0 * k])
Julia reports that the functions are the same, and do vary with the input in the expected way:
julia> g1(3)
3.0
julia> g2(3)
3.0
julia> g1(4)
4.0
julia> g2(4)
4.0
However, the ForwardDiff derivatives are different:
julia> ForwardDiff.derivative(g1, 3.0)
0.0
julia> ForwardDiff.derivative(g2, 3.0)
1.0
Note that FiniteDiff works fine here (up to numerical approximation):
julia> FiniteDiff.finite_difference_derivative(g1, 3.0)
0.9999999999991082
julia> FiniteDiff.finite_difference_derivative(g2, 3.0)
0.9999999999991082
Thanks in advance for any suggestions the ForwardDiff team has in understanding this difference.
Maybe that's actually a problem with Roots and the same problem as described in https://github.com/JuliaMath/Roots.jl/issues/314 (ForwardDiff derivative depends on the number of iterations)?
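One way to see how the bracket alone could change the ForwardDiff result is to reproduce the effect with a hand-written bisection. The sketch below is only an illustration under stated assumptions: `bisect`, `h1`, `h2` and the `iters` keyword are hypothetical names, and the loop is a plain textbook bisection, not the algorithm Roots actually runs. With a constant bracket every midpoint is a plain Float64, so the returned root carries no dual part and the reported derivative is zero. With a bracket built from k, the endpoints are dual numbers and their partials flow through the midpoints into the result.

```julia
using ForwardDiff

# Hypothetical textbook bisection (NOT Roots' implementation).
# Assumes f(a) <= 0 <= f(b); only the sign of f is ever inspected.
function bisect(f, a, b; iters = 60)
    fa = f(a)
    for _ in 1:iters
        m = (a + b) / 2
        fm = f(m)
        # Compare Bools so that only the values of the duals are used.
        if (fm > zero(fm)) == (fa > zero(fa))
            a, fa = m, fm   # root lies in [m, b]
        else
            b = m           # root lies in [a, m]
        end
    end
    return (a + b) / 2
end

g0(x, k) = (x - k)^3

# Constant bracket: a, b and every midpoint stay Float64.
h1(k) = bisect(x -> g0(x, k), -10.0, 10.0)

# Bracket built from k: a, b and every midpoint are dual numbers.
h2(k) = bisect(x -> g0(x, k), -k, k)

d1 = ForwardDiff.derivative(h1, 3.0)  # result has no dual part
d2 = ForwardDiff.derivative(h2, 3.0)  # partials come from the bracket

# Stopping early changes the k-dependent result, so the derivative
# depends on the number of iterations.
d2_short = ForwardDiff.derivative(k -> bisect(x -> g0(x, k), -k, k; iters = 5), 3.0)

println((d1, d2, d2_short))
```

In this sketch the derivative information never comes from g0 itself, only from whatever the bracket endpoints carry, which would be consistent with the iteration-count dependence described in the linked Roots issue.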