ForwardDiff2.jl
Dual leakage
I can make the dual type leak out into a global variable:
julia> function f(x)
    global y = sin(x)
end
f (generic function with 1 method)
julia> ForwardDiff2.D(f)(1)*1
0.5403023058681398
julia> y
(0.8414709848078965 + 0.5403023058681398ϵ₁)
Presumably this can happen any time a differentiated value escapes the program, e.g. when you have a global cache or similar.
I assume that's going to be useful for defining mutable buffers.
FWIW, this is not just academic since it can lead to bad gradients (effectively a form of perturbation confusion). For example:
julia> function f(x)
    global y = sin(x)
end
f (generic function with 1 method)
julia> g(x) = x+y
g (generic function with 1 method)
julia> ForwardDiff2.D(f)(2)*1
-0.4161468365471424
julia> ForwardDiff2.D(g)(2)*1
0.5838531634528576
Compare Zygote:
julia> gradient(f, 2)[1]
-0.4161468365471424
julia> gradient(g, 2)[1]
1.0
This is a little contrived, but any function that both updates and reads a global cache could follow the same pattern and get silently incorrect gradients.
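For anyone who wants to see the mechanism without the package, here is a minimal sketch using a toy dual number. `ToyDual`, `toy_derivative`, `leaky`, `uses_cache` and `cache` are names I made up for illustration; this is not ForwardDiff2's actual type or API, just an untagged forward-mode dual that behaves the same way in this case.

```julia
# Hypothetical toy dual number: a value plus a single, untagged perturbation.
struct ToyDual
    value::Float64
    partial::Float64
end

# Only the operations needed for this example.
Base.sin(d::ToyDual) = ToyDual(sin(d.value), cos(d.value) * d.partial)
Base.:+(a::ToyDual, b::ToyDual) = ToyDual(a.value + b.value, a.partial + b.partial)
Base.:+(a::ToyDual, b::Real) = ToyDual(a.value + b, a.partial)
Base.:+(a::Real, b::ToyDual) = b + a

# Hypothetical derivative helper: seed the input with partial 1 and read it back.
function toy_derivative(f, x)
    out = f(ToyDual(x, 1.0))
    return out isa ToyDual ? out.partial : 0.0
end

# Stores the intermediate result, dual part included, in a global.
function leaky(x)
    global cache = sin(x)
end

# Reads the global, which is still a ToyDual left over from the previous call.
uses_cache(x) = x + cache

toy_derivative(leaky, 2.0)       # ≈ cos(2), as expected
toy_derivative(uses_cache, 2.0)  # ≈ 1 + cos(2), although d/dx (x + constant) is 1
```

Since the perturbation carries no tag, the stale partial sitting in `cache` cannot be told apart from the one seeded by the second call, so the two get added together. That is the same number as the `D(g)(2)` result above.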