Using ForwardDiff with a vector function (as in IntelVectorMath)
Is there any way to efficiently use ForwardDiff with functions that operate on vectors and return vectors? Suppose I have a function f that operates on a vector by applying some operation independently to each element (as in a broadcast operation). This, for example, is the case for many functions from IntelVectorMath or AppleAccelerate. Although, formally, f is a function from R^n to R^n, in reality it does not make sense to compute a full Jacobian for it, since it would be a diagonal matrix. Is there a way to have ForwardDiff return only the diagonal elements?
Note that repeated calls to ForwardDiff.derivative, although feasible, would be a poor workaround, since it would not allow me to use the fast math operations of IntelVectorMath or AppleAccelerate.
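For concreteness, the workaround I mean is something like the following sketch (with `sin` as a stand-in scalar kernel; a library like IntelVectorMath only exposes the fast vectorized form, which this per-element approach cannot call):

```julia
using ForwardDiff

# Elementwise workaround: one scalar ForwardDiff.derivative call per
# entry. This yields the Jacobian diagonal, but never touches the fast
# vectorized routines.
x = [0.3, 1.2, -0.7]
jdiag = ForwardDiff.derivative.(sin, x)   # the Jacobian diagonal, cos.(x)
```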
A vector of duals can be written as X = x + J * eps, where x is a vector of real numbers, J is a matrix containing the "partials", and eps is the "dual unit" (cf. the imaginary unit).
For a function f from R^n to R^n to support operating on a vector of dual values (written X above), it needs to itself return a vector of dual values such that:
f(X) = f(x) + [df / dx * J] * eps
You can see an example of this here: https://github.com/JuliaDiff/ForwardDiff.jl/pull/165/files#diff-25ac750be2a2ebb3d8cdbe9e820679d1R116-R130.
If you know that df / dx is diagonal you could simplify the matrix multiplication to just scale the rows of J appropriately.
Now, one problem you have is that to call a vectorized function you typically need the x in a plain vector/tuple. However, that is not how a vector of Duals is stored in ForwardDiff (the value and partials are stored together in a struct). So you would need to extract that and create something like x = [X[1].value, X[2].value, X[3].value, X[4].value], then call fx = vectorized_sin(x), df_dx = vectorized_cos(x), and finally create the output [Dual(fx[1], (df_dx[1] * J[1, :])...), Dual(fx[2], (df_dx[2] * J[2, :])...), ...].
For this to be efficient you probably want to do this with @generated functions and unroll things like the scaling of J.
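A minimal (non-@generated) sketch of that idea, assuming ForwardDiff's `Dual`/`value`/`partials` API and a diagonal df/dx, so each element's partials are just scaled:

```julia
using ForwardDiff
using ForwardDiff: Dual, value, partials

# Sketch: wrap a vectorized function `vf` and its elementwise
# derivative `vdf` (both hypothetical stand-ins for e.g. a fast
# vectorized sin and cos) so the pair accepts a vector of Duals.
function apply_vectorized(vf, vdf, X::AbstractVector{<:Dual})
    x = value.(X)      # pull the real parts out of the Duals
    fx = vf(x)         # one fast vectorized call on plain reals
    df_dx = vdf(x)     # elementwise derivative, also one vectorized call
    # df/dx is diagonal, so row i of the output partials is just
    # df_dx[i] * J[i, :]: scale each element's partials.
    D = eltype(X)
    return [D(fx[i], df_dx[i] * partials(X[i])) for i in eachindex(X)]
end
```

With `vf = v -> sin.(v)` and `vdf = v -> cos.(v)` this reproduces what broadcasting `sin` over the Duals would give, while making only two vectorized calls.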
@astrozot does that work for you?
ForwardDiff.derivative(λ -> f(x .+ λ), 0.0)
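If I'm reading it right, that one-liner perturbs every component by the same scalar λ, so it computes J * ones(n) at λ = 0, which for an elementwise f is exactly the Jacobian diagonal (though it still requires f itself to accept a vector of Duals):

```julia
using ForwardDiff

x = [0.3, 1.2, -0.7]
f(v) = sin.(v)   # stand-in: this only works if f accepts a vector of Duals
# d/dλ f(x .+ λ) at λ = 0 is J * ones(n); for elementwise f that is
# the Jacobian diagonal, here cos.(x).
jdiag = ForwardDiff.derivative(λ -> f(x .+ λ), 0.0)
```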
Unless any call to ForwardDiff.derivative is bad? [Edit: I guess @KristofferC's reply means it's bad 😥]
How could that ever call a vectorized function f (which is a black-box function that can only take a vector of reals)? At some point you need to pull out the reals from the vector of duals.