Kyle Daruwalla
Why does this depend on `Zeros`? If this is _the_ branch, then I guess let's rebase it?
I'm not sure what you mean. Why is this not enough?
```julia
Optimisers.init(o, ps::Params) = [Optimisers.init(o, p) for p in ps]
```
You mean where we allow passing in `[W, b]`? Wouldn't that route be handled by Optimisers.jl already?
Not at the computer, but a quick suggestion would be to use `Flux.fmap` (from Functors.jl) to modify the parameters instead of dealing with `Params`. Similarly use `loadmodel!` instead of `loadparams!`...
Looks like Brian beat me to the punch, but I already typed this out, so here it is. ---- Okay let me see if I understand the problem statement correctly....
> I did not completely understand the walk_trainable function yet. You can think of a functor as a tree of nodes representing the model, with the array parameters as the...
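To make the tree picture concrete, here is a minimal sketch in plain Julia. It is not the Functors.jl implementation: `toy_fmap` is a hypothetical helper, and the model is assumed to be nested `NamedTuple`s and `Tuple`s with arrays at the leaves.

```julia
# Toy stand-in for a functor walk (hypothetical, not the Functors.jl API).
# Interior nodes are Tuples/NamedTuples; leaves are arrays.
toy_fmap(f, x::AbstractArray) = f(x)            # leaf: apply f to the array
toy_fmap(f, x::Union{Tuple,NamedTuple}) =
    map(c -> toy_fmap(f, c), x)                 # node: recurse, then rebuild the same container
toy_fmap(f, x) = x                              # anything else (e.g. functions) is left untouched

# A small "model": two layers with weights and biases, plus a non-array field.
model = (layers = ((W = ones(2, 2), b = zeros(2)),
                   (W = ones(1, 2), b = zeros(1))),
         act = tanh)

# Rebuild the whole tree with every array leaf doubled.
doubled = toy_fmap(a -> 2 .* a, model)
```

The result has the same structure as `model`; only the array leaves change, which is the behaviour the tree description relies on.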
Most of the cost is walking the model, which you have to do no matter what. The `rebuild` method of each model is cheap to throw away, I think. It...
Any updates on this (like benchmarks after unfusing)?
We could transition to Pollen
As far as optimization goes, I don't think the Lux optimizations do what you're proposing. Instead, they recursively go through the `Chain` to wrap functions that don't adhere to the...