ForwardDiff.jl
derivative of `norm` at 0
`norm` is not differentiable at 0, so at best you can return a subgradient. It appears that the subgradient returned is 1.0 at 0.0 (and -1.0 at -0.0).
julia> ForwardDiff.gradient(norm, [0.0, 0.0])
2-element Array{Float64,1}:
0.0
1.0
julia> ForwardDiff.gradient(norm, [0.0, -0.0])
2-element Array{Float64,1}:
-0.0
-1.0
I'm wondering if it would be worth defining `Base.norm` on `ForwardDiff.Dual` so that it returns a subgradient of 0.0 at both 0.0 and -0.0.
Also, perhaps I missed this, but I think it would be nice to mention somewhere that in generic auto-diffable code `sqrt(sum(v.^2))` should be replaced with `norm`, since `sqrt` is singular at 0 and produces a `NaN` when composed with a function with 0 gradient (0*Inf = NaN).
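To spell out the `0*Inf = NaN` remark, here is a sketch of the chain rule for `sqrt(sum(v.^2))`. The symbol $g$ for the inner function is notation introduced here for illustration, not something from ForwardDiff itself.

$$
g(v) = \sum_i v_i^2, \qquad \nabla g(v) = 2v, \qquad \nabla \sqrt{g(v)} = \frac{1}{2\sqrt{g(v)}}\,\nabla g(v)
$$

At $v = 0$ we have $g(0) = 0$ and $\nabla g(0) = 0$, so the outer factor $1/(2\sqrt{0})$ evaluates to `Inf` in floating point and is multiplied by a zero inner gradient. Under IEEE 754 arithmetic that product is `NaN`, even though a subgradient of 0 would be a perfectly reasonable answer.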
Here is an interesting effect that I am guessing is related?
using ForwardDiff, StaticArrays
# - ForwardDiff 0.7.3
# - StaticArrays 0.6.6
u = x -> (1.0 + norm(x)^2)^(-1/2)
∇u = x -> ForwardDiff.gradient(u, x)
∇u(zeros(2))
# 2-element Array{Float64,1}:
# -0.0
# -0.0
∇u(@SVector zeros(2))
# 2-element SVector{2,Float64}:
# NaN
# NaN
(`u = x -> (1.0 + sum(x.^2))^(-1/2)` works fine for both.)
Here's something to think about related to this issue:
using ForwardDiff
using LinearAlgebra
# start with zero valued vector of dual numbers
v = zeros(ForwardDiff.Dual{Nothing, Float64, 1}, 3);
# assume a perturbation of one component exists due to some computational noise
value = 1.0e-200 # so that value^2 == 0.0 (due to machine precision)
partial = 1.0e-100 # so that 2*value*partial != 0.0 (due to machine precision)
v[1] = ForwardDiff.Dual{Nothing}(value, partial);
# try out the two methods
norm(v)
# Dual{Nothing}(1.0e-200,NaN)
sqrt(sum(v.^2))
# Dual{Nothing}(0.0,Inf)
Both implementations will result in NaNs propagated throughout the function, even in NaN-safe mode. I encountered this when propagating derivatives through a Newton solve and it took me a lot of time to find the issue.
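For the `sqrt(sum(v.^2))` case, the `Inf` partial can be reproduced with plain `Float64` arithmetic. This is only a sketch of the chain-rule step, assuming the derivative of `sqrt(s)` is computed as `ds / (2 * sqrt(s))`; the names `s`, `ds`, and `dsqrt` are made up for illustration, and it does not attempt to explain the `NaN` coming out of `norm`.

```julia
# Plain Float64 arithmetic only; no ForwardDiff needed.
value   = 1.0e-200   # primal part of v[1], as in the example above
partial = 1.0e-100   # derivative part of v[1]

# 1e-400 is below the smallest subnormal Float64, so the square underflows to 0.0
s = value^2

# 2e-300 is still representable, so the derivative of the square stays nonzero
ds = 2 * value * partial

# chain rule for sqrt: d(sqrt(s)) = ds / (2 * sqrt(s)), i.e. nonzero / 0.0
dsqrt = ds / (2 * sqrt(s))

println((s, ds, dsqrt))
```

Here `s` is exactly `0.0` while `ds` is not, so `dsqrt` is `Inf`. Any later step that multiplies this `Inf` by a zero derivative then yields `NaN`.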
> (...) and it took me a lot of time to find the issue.
Could you please share how you ended up working around it?
I created a new issue related to my comment as it is somewhat tangential to this issue. I'll post my workaround there.