ForwardDiff.jl

Performance issue/allocations in DiffResults-hessian! with StaticArrays.

Open DNF2 opened this issue 1 year ago • 0 comments

When calculating Hessians with DiffResults and StaticArrays, there are unexpected allocations:

using DiffResults, ForwardDiff, StaticArrays, BenchmarkTools

g = r -> (r[1]^2 - 3) * (r[2]^2 - 2);
x = SA_F32[0.5, 2.7];
hres = DiffResults.HessianResult(x);
@btime ForwardDiff.hessian!($hres, $g, $x)
 -> 52.077 ns (2 allocations: 96 bytes)
ImmutableDiffResult(-14.547502, (Float32[5.2900004, -14.85], Float32[10.580001 5.4; 5.4 -5.5]))
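
As a sanity check on the printed result, the value, gradient, and Hessian of g can be written in closed form. The following is a minimal sketch using only Base Julia; the helper names grad_g and hess_g are introduced here for illustration and are not part of ForwardDiff or DiffResults.

```julia
# Same function as in the benchmark above.
g(r) = (r[1]^2 - 3) * (r[2]^2 - 2)

# Hand-derived gradient (hypothetical helper, for illustration only).
grad_g(r) = [2 * r[1] * (r[2]^2 - 2), 2 * r[2] * (r[1]^2 - 3)]

# Hand-derived Hessian (hypothetical helper, for illustration only).
function hess_g(r)
    hxx = 2 * (r[2]^2 - 2)   # second derivative in r[1]
    hxy = 4 * r[1] * r[2]    # mixed derivative
    hyy = 2 * (r[1]^2 - 3)   # second derivative in r[2]
    return [hxx hxy; hxy hyy]
end

x = Float32[0.5, 2.7]
println(g(x))
println(grad_g(x))
println(hess_g(x))
```

Up to Float32 rounding, these should agree with the value, gradient, and Hessian stored in the ImmutableDiffResult shown above.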

This is due to Partials{N} being treated as a dynamically sized AbstractArray by extract_jacobian. There are two implementations of extract_jacobian:

@generated function extract_jacobian(::Type{T}, ydual::StaticArray, x::S) where {T,S<:StaticArray}
    M, N = length(ydual), length(x)
    result = Expr(:tuple, [:(partials(T, ydual[$i], $j)) for i in 1:M, j in 1:N]...)
    return quote
        $(Expr(:meta, :inline))
        V = StaticArrays.similar_type(S, valtype(eltype($ydual)), Size($M, $N))
        return V($result)
    end
end

function extract_jacobian(::Type{T}, ydual::AbstractArray, x::StaticArray) where T
    result = similar(ydual, valtype(eltype(ydual)), length(ydual), length(x))
    return extract_jacobian!(T, result, ydual, length(x))
end

and Partials{N}, which is not a StaticArray, will hit the second one, where similar allocates a Matrix.
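
The dispatch behaviour can be reproduced in isolation. The sketch below uses a hypothetical ToyPartials type and a hypothetical extract_toy function (neither exists in ForwardDiff) to show how a fixed-length type that is only an AbstractVector falls through to an AbstractArray method built on similar, and how a more specific method avoids that.

```julia
# Hypothetical stand-in for ForwardDiff.Partials: the length N lives in the type,
# but the type is only an AbstractVector, not a StaticArray.
struct ToyPartials{N,T} <: AbstractVector{T}
    values::NTuple{N,T}
end
Base.size(::ToyPartials{N}) where {N} = (N,)
Base.getindex(p::ToyPartials, i::Int) = p.values[i]

# Generic fallback (hypothetical): builds the output with `similar`, which for a
# type without its own `similar` method returns a heap-allocated Array.
function extract_toy(y::AbstractArray, ::Val{C}) where {C}
    out = similar(y, eltype(y), length(y), C)
    for j in 1:C, i in 1:length(y)
        out[i, j] = y[i]
    end
    return out
end

p = ToyPartials((1.0f0, 2.0f0))
before = extract_toy(p, Val(2))   # only the fallback exists at this point
println(typeof(before))

# More specific method (hypothetical): N is known from the type, so the result
# can be an isbits tuple in column-major order instead of an Array.
extract_toy(y::ToyPartials{N,T}, ::Val{C}) where {N,T,C} =
    ntuple(k -> y[mod1(k, N)], Val(N * C))

after = extract_toy(p, Val(2))    # now dispatches to the specific method
println(typeof(after))
```

This toy only mirrors the dispatch pattern. The real extract_jacobian methods additionally unpack dual numbers via partials, which is not modelled here.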

I'm not sure whether the correct solution is to extend the signature of the @generated method so that it also accepts Partials, or to change what similar does. As a small test I just defined a new method for extract_jacobian:

@generated function extract_jacobian(::Type{T}, ydual::Partials{M}, x::S) where {M, T, S<:StaticArray}
    N = length(x)
    result = Expr(:tuple, [:(partials(T, ydual[$i], $j)) for i in 1:M, j in 1:N]...)
    return quote
        $(Expr(:meta, :inline))
        V = StaticArrays.similar_type(S, valtype(eltype($ydual)), Size($M, $N))
        return V($result)
    end
end

After this change, I get:

1.11.1> @btime ForwardDiff.hessian!($hres, $g, $x)
  16.650 ns (0 allocations: 0 bytes)
ImmutableDiffResult(-14.547502, (Float32[5.2900004, -14.85], Float32[10.580001 5.4; 5.4 -5.5]))
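
For a check that does not depend on BenchmarkTools, the allocation count can also be measured with Base's @allocated. This is a rough sketch, assuming DiffResults, ForwardDiff, and StaticArrays are installed; the helper name hessian_allocs is made up for illustration, and the reported number may vary with the Julia and package versions in use.

```julia
using DiffResults, ForwardDiff, StaticArrays

g(r) = (r[1]^2 - 3) * (r[2]^2 - 2)

# Hypothetical helper: measure inside a function so that access to global
# variables is not counted, and run once first so compilation is excluded.
function hessian_allocs(f, x)
    hres = DiffResults.HessianResult(x)
    ForwardDiff.hessian!(hres, f, x)                    # warm-up call
    return @allocated ForwardDiff.hessian!(hres, f, x)  # bytes allocated
end

x = SA_F32[0.5, 2.7]
println("bytes allocated by hessian!: ", hessian_allocs(g, x))
```

A non-zero result on an unpatched installation would be consistent with the allocations reported above. With the extra extract_jacobian method loaded, the expectation is that it drops to zero, matching the @btime output.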

DNF2, Nov 15 '24 12:11