
Optimisers.jl defines many standard optimisers and utilities for learning loops.

56 Optimisers.jl issues

This proposes to add some kind of differentiable `sum(f, trainable(x))` which walks the model. I'm not certain this is the right thing yet. Right now this gets all trainable parameters....

enhancement
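For orientation, here is a rough, non-differentiable sketch of what the proposed walk could do, built on Functors' `fmap` which Optimisers.jl already uses internally. The helper name `sum_params` and the fact that it visits all numeric arrays (not only `trainable` ones) are assumptions for illustration, not the proposed API:

```julia
using Functors

# Hypothetical helper: apply `f` to every numeric leaf array and sum the results.
# The actual proposal would respect `trainable` at each node and be differentiable.
function sum_params(f, model)
    total = 0.0
    fmap(model; exclude = x -> x isa AbstractArray{<:Number}) do x
        total += f(x)   # accumulate f over this parameter array
        return x        # leave the model unchanged
    end
    return total
end

# e.g. an L2 penalty over all parameters (usage illustrative only):
# penalty = sum_params(x -> sum(abs2, x), model)
```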

This is a proposal for an alternative to `destructure` which doesn't completely flatten the parameters but returns a nested named tuple. The associated reconstructor can be used on `ComponentArray`s...

In some situations, you have to restructure a lot if you use Flux, for instance if you want to run your batches as separate solves in DiffEqFlux using an EnsembleProblem....

enhancement
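For context, this is the existing `destructure` API that the two issues above want to complement; only the final comment speculates about the proposed nested form:

```julia
using Optimisers, Flux

model = Chain(Dense(2 => 3, tanh), Dense(3 => 1))

# Current behaviour: one flat vector plus a reconstructor that copies it back
# into a fresh model each time it is called.
flat, re = Optimisers.destructure(model)
model2 = re(flat)            # rebuild from (possibly modified) parameters

# The proposal would instead return a nested NamedTuple mirroring the model,
# whose reconstructor could also accept a ComponentArray (sketch only).
```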

This adds a variant of `destructure` with minimal changes such that it writes back into the original model, instead of creating a copy. This may close #146, cc @glatteis. Marked...
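A hedged sketch of the usage the PR above seems to aim for; the `re!` name and its signature are guesses, not the PR's actual interface:

```julia
using Optimisers, Flux

model = Chain(Dense(2 => 3, tanh), Dense(3 => 1))
flat, re = Optimisers.destructure(model)   # today: `re(flat)` allocates a new model

# Hypothetical in-place counterpart: overwrite `model`'s own arrays instead of
# building a copy, which is what "writes back into the original model" suggests.
# re!(model, 2 .* flat)    # afterwards model's weights would hold the scaled values
```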

This came up here: https://discourse.julialang.org/t/on-the-future-of-flux-destructure-and-sciml-integration/104760/4. Maybe the idea that ComponentArrays.jl treats shared arrays as independent should be mentioned as an important difference, perhaps at the end of this section: https://fluxml.ai/Optimisers.jl/dev/#Obtaining-a-flat-parameter-vector...

documentation
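The difference being asked about shows up with a pair of tied arrays. The `destructure` behaviour below is the documented one; the contrast with ComponentArrays.jl is as described in the issue:

```julia
using Optimisers

w = rand(3, 3)
tied = (a = w, b = w)        # two fields sharing one underlying array

flat, re = Optimisers.destructure(tied)
length(flat)                 # 9, not 18: destructure stores the shared array once

# ComponentArrays.jl would instead give `a` and `b` independent blocks of storage,
# so the tie is silently broken -- the difference the issue wants documented.
```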

`destructure` uses `map`, I think from before support for Dict was added elsewhere, hence this fails:

```julia
julia> d = Dict( :a => Dict( :b => Dict( :c => 1,...
```

bug
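A hedged reconstruction of the failing call, since the snippet above is cut off; the exact values and nesting depth are made up for illustration:

```julia
using Optimisers

# Nested Dicts, roughly in the shape the issue starts to build:
d = Dict(:a => Dict(:b => Dict(:c => 1, :d => [2.0, 3.0])))

# Because `destructure` walks containers with `map`, which predates Dict support
# elsewhere in the package, this call fails instead of flattening `d`:
flat, re = Optimisers.destructure(d)
```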

https://github.com/FluxML/Optimisers.jl/pull/136 missed some stray uses of `state` in the readme.

Dear All, I have found a bug in the Adam optimiser when used with Float16. An MWE would look like this:

```julia
using Optimisers
x = Float16[0.579, -0.729, 0.5493]
δx =...
```
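Since the MWE above is truncated, here is a hedged way to probe the report: apply one Adam step to the same values in Float16 and Float32 and compare. The gradient values are invented for illustration:

```julia
using Optimisers

g = [0.01, 0.02, -0.01]                      # made-up gradient for illustration

x16 = Float16[0.579, -0.729, 0.5493]
st16 = Optimisers.setup(Adam(0.01), x16)
st16, x16 = Optimisers.update(st16, x16, Float16.(g))

x32 = Float32[0.579, -0.729, 0.5493]
st32 = Optimisers.setup(Adam(0.01), x32)
st32, x32 = Optimisers.update(st32, x32, Float32.(g))

@show x16 x32    # a large drift or NaN in x16 would reproduce the reported bug
```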

```
using Flux

function test_setup(opt, s)
    state = Flux.setup(opt, s)
    return state
end

s = Chain(
    Dense(2 => 100, softsign),
    Dense(100 => 2)
)
opt = Adam(0.1)
@code_warntype test_setup(opt, s)...
```

This lets you chain optimisation rules like `setup(ClipGrad(1.0) => WeightDecay() => Descent(0.1), model)`. It's just notation & printing; there is no functional change to `OptimiserChain`. Edit: now uses `>>` instead...
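For reference, what the new notation is sugar for. `OptimiserChain` is the existing, unchanged API; only the commented line paraphrases the PR's syntax:

```julia
using Optimisers

# The existing way to compose rules, which the PR's notation merely abbreviates:
rule = OptimiserChain(ClipGrad(1.0), WeightDecay(), Descent(0.1))

# state = Optimisers.setup(rule, model)            # as usual, for some `model`

# Per the PR (after its edit), roughly:
# ClipGrad(1.0) >> WeightDecay() >> Descent(0.1)   # builds the same OptimiserChain
```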