Scalar indexing error when computing Jacobian with a `CUDA.CuArray`
Consider the following minimal reproducible example:

```julia
using CUDA, ForwardDiff

x = cu(ones(3))
f(x) = x.^2
ForwardDiff.jacobian(f, x)
```
Result on ForwardDiff 1.0.0:

```julia
julia> ForwardDiff.jacobian(f, x)
3×3 CuArray{Float32, 2, CUDA.DeviceMemory}:
 2.0  0.0  0.0
 0.0  2.0  0.0
 0.0  0.0  2.0
```

which is obviously correct.
On ForwardDiff 1.0.1:

```
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
```
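As a stopgap (not a fix), the error message's own suggestion applies: wrapping the call in `CUDA.@allowscalar` permits the scalar iteration for that one expression. Note that this runs the offending element-wise work on the CPU, so it can be very slow for large arrays:

```julia
using CUDA, ForwardDiff

x = cu(ones(3))
f(x) = x.^2

# Temporary workaround: allow scalar indexing for this call only.
# The iterating code path then runs (slowly) on the CPU instead of erroring.
J = CUDA.@allowscalar ForwardDiff.jacobian(f, x)
```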
Issue introduced in: https://github.com/JuliaDiff/ForwardDiff.jl/pull/739
Unfortunate, but it is hard not to accidentally break GPU support as long as it is completely untested: currently the package does not run a single test on a GPU, and I am not even sure the docs make any official claim about GPU compatibility.
For the issue here, two approaches seem possible:
1) use a non-iterating fallback, or
2) add overloads for GPU arrays.
I agree with your point that GPU support is not guaranteed anywhere, so the lack of testing is a natural consequence of that. Still, I would be happy to try and help.
But how?
- Since the iterating functions were introduced for matrices with special structure (diagonal etc.), maybe a non-iterating fallback is the logical implementation here, essentially reverting to the previous behavior for non-diagonal matrices?
- Are GPU tests desirable? If yes, I could start adding a few basic tests in a PR.
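To illustrate what a non-iterating fallback could look like: instead of writing the partials into the result matrix with per-element `getindex`/`setindex!` (which triggers scalar indexing on a `CuArray`), the partials can be extracted with a broadcast over whole columns. A rough sketch, assuming `ydual` is the dual-valued output vector; the function name `extract_jacobian_broadcast!` is illustrative and not ForwardDiff's actual internal API:

```julia
using ForwardDiff

# Hypothetical non-iterating extraction: fill column i of the Jacobian
# from the i-th partial of each element of `ydual`, via broadcasting.
# No scalar getindex/setindex! on individual elements, so this pattern
# also works for GPU arrays.
function extract_jacobian_broadcast!(result::AbstractMatrix,
                                     ydual::AbstractVector, n::Integer)
    for i in 1:n
        # `ForwardDiff.partials.(ydual, i)` broadcasts over the array
        # as a whole; the column assignment is likewise a broadcast.
        result[:, i] .= ForwardDiff.partials.(ydual, i)
    end
    return result
end
```

The outer loop here runs over the `n` input dimensions on the host, which is cheap; only the per-element work is expressed as array operations.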
Referencing https://github.com/JuliaDiff/ForwardDiff.jl/pull/634 here as well (performance impact of broadcasting vs. loops on Arrays).