ForwardDiff.jl icon indicating copy to clipboard operation
ForwardDiff.jl copied to clipboard

Implement hessian! for scalar x

Open ojwoodford opened this issue 1 year ago • 0 comments

If I have a function f(x::Real)::Real I have two options for computing f(x), f'(x), f''(x):

using ForwardDiff, DiffResults, StaticArrays, BenchmarkTools

function computehessian(f, x::AbstractArray)
    result = DiffResults.HessianResult(x)
    result = ForwardDiff.hessian!(result, f, x)
    return DiffResults.value(result), DiffResults.gradient(result), DiffResults.hessian(result)
end
computehessian(f, x::T) where T <: Number = map(first, computehessian(f ∘ first, SVector{1, T}(x)))

function computehessian2(f, x)
    df(x)  = ForwardDiff.derivative(f, x)
    d2f(x) = ForwardDiff.derivative(df, x)
    return (f(x), df(x), d2f(x))
end

x = rand()
w = rand()
s1 = rand()
s2 = rand()
f(w, s1, s2, x) = log(w * s1 * exp(-0.5 * x * s1 ^ 2) + (1-w) * s2 * exp(-0.5 * x * s2 ^ 2))
expandfunc(args, v) = args[1](args[2:end]..., v)
fixallbutlast(func, args...) = Base.Fix1(expandfunc, (func, args...))
g = fixallbutlast(f, w, s1, s2)
@btime computehessian($g, $x)
@btime computehessian2($g, $x)

  69.075 ns (1 allocation: 64 bytes)
  83.289 ns (0 allocations: 0 bytes)

The first approach, using the hessian! function, is faster because the value, 1st and 2nd derivatives can all be computed with a single call to f (using a DiffResult structure), whereas the second approach requires 4 calls of f to perform the same calculation. If f is slow, then the second approach is much worse. However, the first approach doesn't support scalar x::T (where T <: Number), requiring a wrapper to get around this. Furthermore, it causes an allocation, making the calculation slower than necessary.

Is it possible to implement the functionality of hessian! with a DiffResult input, but for scalar x? If so, is it possible to avoid the allocation (assuming f doesn't allocate)?

ojwoodford avatar Aug 16 '23 21:08 ojwoodford