ForwardDiff.jl icon indicating copy to clipboard operation
ForwardDiff.jl copied to clipboard

Computational noise can corrupt NaN-safe computations

Open taylormcd opened this issue 3 years ago • 1 comments

Different amounts of computational noise in the value and partials of a dual number has the potential to corrupt even NaN-safe computations.

Minimum working example:

using ForwardDiff, LinearAlgebra

# start with zero valued vector of dual numbers
v = zeros(ForwardDiff.Dual{Nothing, Float64, 1}, 3);

# assume a perturbation of one component exists due to some computational noise
value = 1.0e-200 # so that value^2 == 0.0 (due to machine precision)
partial = 1.0e-100 # so that 2*value*partial != 0.0 (due to machine precision)
v[1] = ForwardDiff.Dual{Nothing}(value, partial);

# try out two functions
norm(v)
# Dual{Nothing}(1.0e-200,NaN)
sqrt(sum(v.^2))
# Dual{Nothing}(0.0,Inf)

Both implementations will result in NaNs propagated throughout the function, even in NaN-safe mode. I first encountered this after propagating derivatives through a Newton solve.

Originally posted by @taylormcd in https://github.com/JuliaDiff/ForwardDiff.jl/issues/243#issuecomment-539291591

taylormcd avatar Apr 13 '21 17:04 taylormcd

One possible workaround is to replace the zero-testing in the NaN-safe function definitions with approximate zero-testing as follows.

if NANSAFE_MODE_ENABLED
    isapproxzero(x) = isapprox(x, zero(x), atol=eps(promote_type(eltype(x), Float64)), norm=(v)->norm(v, Inf))

    @inline function Base.:*(partials::Partials, x::Real)
        x = ifelse(!isfinite(x) && isapproxzero(partials), one(x), x)
        return Partials(scale_tuple(partials.values, x))
    end

    @inline function Base.:/(partials::Partials, x::Real)
        x = ifelse(isapproxzero(x) && isapproxzero(partials), one(x), x)
        return Partials(div_tuple_by_scalar(partials.values, x))
    end

    @inline function _mul_partials(a::Partials{N}, b::Partials{N}, x_a, x_b) where N
        x_a = ifelse(!isfinite(x_a) && isapproxzero(a), one(x_a), x_a)
        x_b = ifelse(!isfinite(x_b) && isapproxzero(b), one(x_b), x_b)
        return Partials(mul_tuples(a.values, b.values, x_a, x_b))
    end
else

Here's the results with the modified code:

norm(v)
# Dual{Nothing}(1.0e-200,2.0e-100)
sqrt(sum(v.^2))
# Dual{Nothing}(0.0,2.0e-300)

While the partials of the results may not be strictly correct, they are still essentially zero and no NaNs are propagated. Additionally ForwardDiff tests still pass with these modifications.

That being said, there's a (possibly significant) performance penalty associated with this fix since isapprox is not typically used in performance-critical scenarios.

taylormcd avatar Apr 13 '21 17:04 taylormcd