Optimisers.jl
Optimisers.jl defines many standard optimisers and utilities for learning loops.
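For orientation, here is a minimal sketch of the core API. The "model" is just a NamedTuple of arrays (any Functors-compatible struct works the same way) and the gradient is a hand-written stand-in rather than one produced by Zygote:

```julia
using Optimisers

model = (weight = randn(2, 3), bias = zeros(2))        # stand-in model
state = Optimisers.setup(Adam(0.01), model)            # per-parameter optimiser state

grad = (weight = ones(2, 3), bias = ones(2))           # stand-in for a real gradient
state, model = Optimisers.update(state, model, grad)   # returns new state and updated model
```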
This proposes to add some kind of differentiable `sum(f, trainable(x))` which walks the model. I'm not certain this is the right thing yet. Right now this gets all trainable parameters....
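As a hedged sketch of the effect being proposed, one way to get something similar with the existing API is to flatten first: `destructure` already collects exactly the trainable parameters, so a function of all of them (here an L2 penalty, as an example of `f`) can be written as:

```julia
using Optimisers

model = (weight = randn(2, 3), bias = zeros(2))
flat, re = Optimisers.destructure(model)   # flat vector of the trainable parameters
penalty = sum(abs2, flat)                  # e.g. an L2 penalty over all of them
```

The proposal would avoid the flattening step and walk the model directly.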
This is a proposal for an alternative to `destructure` which doesn't completely flatten the parameters but returns a nested named tuple. The associated reconstructor can be used on `ComponentArray`s...
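To illustrate the difference being proposed (not the proposed API itself, just the shape of its output), compare the flat vector `destructure` returns today with a nested named tuple such as the one `Functors.fmapstructure` produces:

```julia
using Optimisers, Functors

model = (layer1 = (weight = randn(2, 3), bias = zeros(2)),
         layer2 = (weight = randn(1, 2),))

flat, re = Optimisers.destructure(model)    # today: one flat Vector plus a reconstructor
nested   = fmapstructure(identity, model)   # a plain nested NamedTuple mirroring the model
```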
In some situations, you have to restructure a lot if you use Flux, for instance if you want to run your batches as separate solves in DiffEqFlux using an EnsembleProblem....
This adds a variant of `destructure` with minimal changes such that it writes back into the original model, instead of creating a copy. This may close #146, cc @glatteis Marked...
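For context on the behaviour this variant changes, the existing reconstructor builds a fresh model rather than writing into the original arrays (a small sketch):

```julia
using Optimisers

model = (weight = randn(2, 3), bias = zeros(2))
flat, re = Optimisers.destructure(model)

new_model = re(flat .* 2)             # current behaviour: the reconstructor builds a copy
model.weight === new_model.weight     # false — the original arrays are left untouched
```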
This came up here: https://discourse.julialang.org/t/on-the-future-of-flux-destructure-and-sciml-integration/104760/4 Maybe the idea that ComponentArrays.jl treats shared arrays as independent should be mentioned as an important difference, perhaps at the end of this section: https://fluxml.ai/Optimisers.jl/dev/#Obtaining-a-flat-parameter-vector...
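For reference, `destructure` itself treats shared (tied) arrays as a single parameter, which is the behaviour the ComponentArrays.jl comparison would contrast with (a small sketch):

```julia
using Optimisers

w = randn(3)
tied = (layer1 = w, layer2 = w)        # the same array bound to two fields

flat, re = Optimisers.destructure(tied)
length(flat)                           # 3, not 6 — the shared array is stored once
```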
`destructure` uses `map`, I think from before support for Dict was added elsewhere, hence this fails: ```julia julia> d = Dict( :a => Dict( :b => Dict( :c => 1,...
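A self-contained example of the kind being described, with made-up values (whether it still fails will depend on the installed version):

```julia
using Optimisers

d = Dict(:a => Dict(:b => [1.0, 2.0]), :c => [3.0])
flat, re = Optimisers.destructure(d)   # reported to error, since the internal walk uses `map`
```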
https://github.com/FluxML/Optimisers.jl/pull/136 missed some stray uses of `state` in the readme.
Dear All, I have found a bug in the Adam optimiser when used with Float16. An MWE would look like this: ```julia using Optimisers x = Float16[0.579, -0.729, 0.5493] δx =...
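A runnable sketch of the shape of such an MWE (the parameter values are the ones quoted above; the gradient values are made up here). For what it's worth, `Float16(1e-8) == 0`, so Adam's default `ϵ` is not representable at this precision, which may be related, though that is only a guess from this excerpt:

```julia
using Optimisers

x  = Float16[0.579, -0.729, 0.5493]    # the Float16 parameters quoted above
δx = Float16[0.01, 0.02, 0.03]         # a made-up Float16 gradient

state = Optimisers.setup(Adam(0.01), x)
state, x = Optimisers.update(state, x, δx)
```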
```julia
using Flux

function test_setup(opt, s)
    state = Flux.setup(opt, s)
    return state
end

s = Chain(
    Dense(2 => 100, softsign),
    Dense(100 => 2)
)
opt = Adam(0.1)
@code_warntype test_setup(opt, s)
```
...
This lets you chain optimisation rules like `setup(ClipGrad(1.0) => WeightDecay() => Descent(0.1), model)`. It's just notation & printing; there is no functional change to `OptimiserChain`. Edit: now uses `>>` instead...
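For comparison, the same chain in the spelling that already exists, without the proposed notation:

```julia
using Optimisers

rule  = OptimiserChain(ClipGrad(1.0), WeightDecay(), Descent(0.1))
model = (weight = randn(2, 3), bias = zeros(2))
state = Optimisers.setup(rule, model)
```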