DiffEqFlux.jl
Specifying adjoint methods for gradient calculations?
Apologies if this is a very basic question: before all the changes in the DiffEqFlux.jl library, one could specify different adjoint methods (in the solve call of the neural ODE), which were then used by sciml_train (my understanding of how that worked is quite rudimentary). Now that optimization is done via Optimization.jl, changing the adjoint method in solve seems to do nothing.
I haven't benchmarked this properly (so it might just be my imagination), but the current adtype options in Optimization.jl feel slower than what I was getting with sciml_train and the adjoint options. Is it still possible to specify the adjoint method used in Optimization.jl's gradient calculations in some way, or has that been removed for some reason (I don't mean only in DiffEqFlux.jl)?
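For concreteness, this is roughly the old workflow I mean (a toy sketch, not my actual code; u0, tspan, tsteps, and ode_data are hypothetical placeholders, and I'm assuming the Flux-based NeuralODE API of that era):

```julia
using DiffEqFlux, OrdinaryDiffEq, SciMLSensitivity, Flux

# u0, tspan, tsteps, ode_data assumed defined elsewhere (toy placeholders)
dudt = Flux.Chain(Flux.Dense(2, 16, tanh), Flux.Dense(16, 2))

# Adjoint choice passed as a solve keyword through the NeuralODE layer
node = NeuralODE(dudt, tspan, Tsit5(); saveat = tsteps,
                 sensealg = InterpolatingAdjoint(autojacvec = ZygoteVJP()))

loss(p) = sum(abs2, ode_data .- Array(node(u0, p)))

# Old style: sciml_train picked up the sensealg above when computing gradients
result = DiffEqFlux.sciml_train(loss, node.p, ADAM(0.05); maxiters = 300)
```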
Also, sorry if this isn't the right repo for this; I wasn't sure whether I should ask in Optimization.jl or here, since this is the original library I was using. Many thanks for the great libraries!
I haven't benchmarked this properly (so it might just be my imagination), but the current adtype options in Optimization.jl feel slower than what I was getting with sciml_train and the adjoint options.
It's the same code. Basically, sciml_train was just moved out to become a fully documented package (Optimization.jl).
Is it still possible to specify the adjoint method used in Optimization.jl's gradient calculations in some way, or has that been removed for some reason (I don't mean only in DiffEqFlux.jl)?
Yes, same syntax (sensealg). It's documented in the docstrings:
https://diffeqflux.sciml.ai/stable/layers/NeuralDELayers/#DiffEqFlux.NeuralODE
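For example (a minimal sketch reusing the placeholder names from the snippet above; the Flux-based NeuralODE API is assumed, so adapt parameter handling to your setup):

```julia
using DiffEqFlux, OrdinaryDiffEq, SciMLSensitivity
using Optimization, OptimizationOptimisers, Zygote

# The adjoint choice still goes into the NeuralODE; it is forwarded to solve
node = NeuralODE(dudt, tspan, Tsit5(); saveat = tsteps,
                 sensealg = InterpolatingAdjoint(autojacvec = ZygoteVJP()))

# Optimization.jl losses take (parameters, hyperparameters)
loss(p, _) = sum(abs2, ode_data .- Array(node(u0, p)))

# The outer AD backend must be reverse mode for sensealg to take effect
optf = OptimizationFunction(loss, Optimization.AutoZygote())
optprob = OptimizationProblem(optf, node.p)
result = Optimization.solve(optprob, OptimizationOptimisers.Adam(0.05); maxiters = 300)
```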
Ah, many thanks! I hadn't read the SciMLSensitivity docs (and it hadn't clicked for me) that AutoForwardDiff would override/ignore sensealg; switching to AutoZygote was the key.
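For anyone finding this later, the distinction (as I understand it) is roughly:

```julia
# Forward mode over the whole loss: ForwardDiff pushes dual numbers through the
# ODE solve itself, so the sensealg passed to the NeuralODE never comes into play.
OptimizationFunction(loss, Optimization.AutoForwardDiff())

# Reverse mode: Zygote hits the adjoint rules SciMLSensitivity defines for solve,
# so the chosen sensealg (e.g. InterpolatingAdjoint) is actually used.
OptimizationFunction(loss, Optimization.AutoZygote())
```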