Possible performance improvement?
Following an initially-unrelated discussion with @andrewrosemberg, I think there might be room for (substantial 🤔🤞) performance improvements in the NonLinearProgram code.
Note: what I write below might apply to other classes of problems, I only checked the nonlinear code so far.
TLDR: I think we can replace {a lot of linear system solves} with {a single linear system solve} during forward/reverse diff
Medium-long explanation:
Forward/reverse diff compute Jacobian-vector products of the form `J*v` or `J'*v`, where the Jacobian `J` has the form `K \ N` for matrices `K`, `N`. Currently, to obtain `w = J*v`, we solve `K*J = N` (a matrix right-hand side, i.e. one linear solve per column of `N`), then compute the product `J*v`. I believe we can instead obtain `w = J*v` by solving `K*w = N*v`, which requires a single linear system solve (with a vector right-hand side).
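To make the difference concrete, here is a minimal self-contained sketch (dimensions and matrices are made-up stand-ins, not the actual DiffOpt data):

```julia
using LinearAlgebra

# Hypothetical sizes; in DiffOpt these come from the problem data.
n, p = 100, 20
K = rand(n, n) + n * I   # square, well-conditioned stand-in for K
N = rand(n, p)
v = rand(p)

# Current approach: solve K * J = N (p right-hand sides), then multiply.
J = K \ N
w_current = J * v

# Proposed approach: one solve with the single vector right-hand side N*v.
w_proposed = K \ (N * v)

w_current ≈ w_proposed   # true, up to floating-point error
```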
Why it matters: on the ACOPF example I work with (a 300-bus case with loads as parameters), `K` has size 13k x 13k and `N` is 13k x 200. Replacing `K \ N` with `K \ (N*v)` potentially yields a ~200x reduction in memory and time, since it performs one linear solve instead of 200.
Complete thought process and stack traversal (I'm focusing on forward diff for the sake of example; the same remarks apply to reverse diff):
- Forward diff returns primal/dual sensitivities in a `ForwCache` structure here: https://github.com/jump-dev/DiffOpt.jl/blob/df4a7dcee059ee048ff620590ff5478d8b2c79cc/src/NonLinearProgram/NonLinearProgram.jl#L534-L537
- These are obtained from a left multiplication with the Jacobian here: https://github.com/jump-dev/DiffOpt.jl/blob/df4a7dcee059ee048ff620590ff5478d8b2c79cc/src/NonLinearProgram/NonLinearProgram.jl#L530-L532
- Said Jacobian `Δs` is computed by the `_compute_sensitivity` function. Stepping into that call, said Jacobian is obtained from `_compute_derivatives_no_relax`: https://github.com/jump-dev/DiffOpt.jl/blob/df4a7dcee059ee048ff620590ff5478d8b2c79cc/src/NonLinearProgram/nlp_utilities.jl#L457
- ...which then points us to a linear system solve: https://github.com/jump-dev/DiffOpt.jl/blob/df4a7dcee059ee048ff620590ff5478d8b2c79cc/src/NonLinearProgram/nlp_utilities.jl#L425-L427
The alternative implementation I propose would cache the factorization of `K` (which seems to already be the case), and lazily compute Jacobian-vector products `w = J*v` as `w = K \ (N*v)`. The math would be similar (I think) for Jacobian-transpose-vector products.
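A hedged sketch of what such a lazy operator could look like; the names below (`JacobianOperator`, `jvp`, `vjp`) are hypothetical illustrations, not DiffOpt's actual API:

```julia
using LinearAlgebra, SparseArrays

# Wraps a cached factorization of K together with N; never materializes J = K \ N.
struct JacobianOperator{F,M}
    Kfact::F   # cached factorization of K (computed once)
    N::M
end

JacobianOperator(K::AbstractMatrix, N::AbstractMatrix) =
    JacobianOperator(lu(K), N)

# Forward mode: w = J * v = K \ (N * v), a single solve with a vector RHS.
jvp(op::JacobianOperator, v::AbstractVector) = op.Kfact \ (op.N * v)

# Reverse mode: w = J' * u = N' * (K' \ u), a single adjoint solve.
vjp(op::JacobianOperator, u::AbstractVector) = op.N' * (op.Kfact' \ u)
```

The reverse-mode identity is what I meant by "similar math": since `J = K \ N`, we have `J'*u = N' * (K' \ u)`, so a Jacobian-transpose-vector product also reduces to a single solve against the (adjoint of the) cached factorization.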
Potential counter-arguments (and counter-counters) to the above:
- If one wants to compute/retrieve the entire Jacobian --> I don't see the full Jacobian being stored/exposed at the moment; this functionality would have to be exposed first
- If one wants to compute large numbers of Jacobian-vector products, it may be more efficient to form the full Jacobian `J`, then do matrix-vector products. --> Note that if `K` (and its factorization) are sparse, it might still be more efficient to compute `Jv = K \ (Nv)` (sparse matvec + sparse solve) rather than a dense matvec product; see the sketch below.
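For completeness, a rough way to sanity-check that second trade-off on sizes similar to the ACOPF case above (the densities are arbitrary placeholders, and `BenchmarkTools` is assumed available):

```julia
using LinearAlgebra, SparseArrays, BenchmarkTools

n, p = 13_000, 200
K = sprand(n, n, 1e-3) + n * I   # sparse, diagonally-dominant stand-in for K
N = sprand(n, p, 1e-2)
v = rand(p)

Kfact = lu(K)                    # factorized once, cached
Jfull = Kfact \ Matrix(N)        # dense 13k x 200 Jacobian, for comparison

@btime $Jfull * $v;              # dense matvec against the materialized J
@btime $Kfact \ ($N * $v);       # sparse matvec + sparse triangular solves
```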
Good point @mtanneau. I missed that we are using `K \ N` in DiffOpt. Indeed, we should use `K \ (N*v)`, which is more memory-efficient. The factorization of `K` should be kept in the cache.
If we want to compute the full Jacobian, it would be better to leverage a linear solver that supports solving linear systems with a sparse right-hand side (like Pardiso). As far as I know, we don't need to evaluate the full Jacobian in our current applications.
> If one wants to compute large numbers of Jacobian-vector products, it may be more efficient to form the full Jacobian `J`, then do matrix-vector products.
I think the code was written with that in mind. In view of your comment, we should indeed probably do what you suggest, at least by default. We could have an option to keep the old behavior in case we hit a use case in which it is faster (probably a case where the constraint matrix is rather dense and we make a lot of Jacobian-vector products).
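If we do keep such an option, a minimal sketch of what the toggle could look like (the function and keyword names are made up, not an existing DiffOpt API):

```julia
# Hypothetical toggle between the two behaviors discussed above.
function jacobian_vector_product(Kfact, N, v; materialize::Bool = false)
    if materialize
        # Old behavior: form the full J (in practice it would be computed
        # once and cached, then reused across many products).
        J = Kfact \ Matrix(N)
        return J * v
    else
        # Proposed default: one vector-RHS solve per product.
        return Kfact \ (N * v)
    end
end
```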
I agree, we don't want the full Jacobian. We should apply the suggested changes.
I should be able to PR the changes; I'll likely have questions regarding the overall architecture / potential changes to existing APIs.