Optimisers.jl
Do not accumulate updates in the presence of a shared gradient;
instead, assume that the gradient has already been accumulated. See https://github.com/FluxML/Optimisers.jl/pull/192/files#r1835058503
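
To illustrate the intent, here is a minimal conceptual sketch in Python. It is not the Optimisers.jl implementation: the function `gradient_per_parameter` and the flag `accumulate_shared_gradient` are hypothetical names invented for this example. It assumes that a "shared gradient" means the very same gradient object appears at every position of a tied parameter, and that gradients held in distinct objects are still summed.

```python
def gradient_per_parameter(params, grads, accumulate_shared_gradient=False):
    """Return {id(param): gradient} for a flat list of (possibly tied) parameters.

    Hypothetical helper, for illustration only.
    """
    total = {}    # id(param) -> gradient that will drive the update
    counted = {}  # id(param) -> ids of gradient objects already counted
    for p, g in zip(params, grads):
        pid = id(p)
        if pid not in total:
            # First time this parameter is seen: take its gradient as is.
            total[pid] = list(g)
            counted[pid] = {id(g)}
        elif id(g) in counted[pid] and not accumulate_shared_gradient:
            # Same gradient object again: assume it already holds the full
            # accumulated gradient, so adding it would double-count.
            continue
        else:
            # A different gradient object (or the double-counting mode):
            # sum element-wise.
            total[pid] = [a + b for a, b in zip(total[pid], g)]
            counted[pid].add(id(g))
    return total


w = [1.0, 2.0]   # one parameter array used at two places (tied weights)
g = [0.5, 0.5]   # one gradient array shared by both places

once = gradient_per_parameter([w, w], [g, g])[id(w)]
twice = gradient_per_parameter([w, w], [g, g], accumulate_shared_gradient=True)[id(w)]
```

With the flag left at its default, the shared gradient is applied once; setting it to `True` reproduces the double-counting that this change is meant to avoid.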