Anthony Blaom, PhD
Thanks for that, but I think I was not clear enough. My understanding is that a Flux RNN *must* be trained on batches that are all the same size. Calling...
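To make the point concrete, here is a minimal sketch of the behaviour I mean, assuming Flux's stateful `Recur` interface (Flux ≤ 0.14; the exact error message varies by version):

```julia
using Flux

rnn = Flux.RNN(3 => 5)        # stateful Recur-wrapped RNN cell

rnn(rand(Float32, 3, 10))     # first batch of 10; hidden state is now 5×10
# rnn(rand(Float32, 3, 7))    # a batch of 7 would now fail: state size mismatch
Flux.reset!(rnn)              # reset the hidden state to its initial value
rnn(rand(Float32, 3, 7))      # fine after the reset
```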
@pat-alt Would you have any time and interest in addressing this issue?
Thanks @pat-alt for this work!

> In any case, I can't really get either of the approaches you suggest to work in this particular case, so we may indeed want...
@pat-alt I don't think your use of `Functors.fmap` is valid here. The `penalty` function takes a tuple of matrices, as returned by `Flux.params(chain)`, and returns a single aggregate number. Your...
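To spell out the mismatch, a minimal sketch (the L2-style `penalty` here is my own stand-in):

```julia
using Flux, Functors

chain = Chain(Dense(3 => 5, relu), Dense(5 => 1))

# `penalty` consumes the whole collection of parameter arrays at once
# and reduces it to one number:
penalty(ps) = sum(sum(abs2, p) for p in ps)
penalty(Flux.params(chain))   # a single scalar, as intended

# `fmap`, by contrast, applies its function to each leaf array separately
# and rebuilds the model structure, so it can never aggregate:
Functors.fmap(x -> 2 .* x, chain; exclude = x -> x isa AbstractArray)
# returns a new Chain with doubled weights, not a scalar
```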
@ToucheSir Thanks for the prompt response and offer of help. So, with the apparatus you describe (Functors.jl, etc.), what code replaces the following to avoid the `params` call, working...
Or, if you prefer, how should the regularisation example in the Flux documentation be rewritten (without the weight-decay trick, which does not work for an L1 penalty)?
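For concreteness, here is one params-free sketch of the kind of thing I am after (untested here), using `Optimisers.destructure` to flatten all trainable parameters; the data and the penalty coefficient are placeholders:

```julia
using Flux, Optimisers

chain = Chain(Dense(3 => 5, relu), Dense(5 => 1))
x, y = rand(Float32, 3, 10), rand(Float32, 1, 10)   # placeholder data

# L1 penalty without `Flux.params`: flatten every trainable array into
# one vector and sum absolute values.
l1(m) = sum(abs, first(Optimisers.destructure(m)))

grads = Flux.gradient(chain) do m
    Flux.mse(m(x), y) + 0.01f0 * l1(m)
end
```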
Thanks for the help @ToucheSir . Unfortunately, `Optimisers.total` is [not working for me](https://github.com/FluxML/Optimisers.jl/pull/57#discussion_r1319211550). I've tried some variations on that approach but without any luck. I suggest we wait on the...
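For anyone following along, here is a hand-rolled stand-in for what I understand the proposed `total(f, model)` to compute, namely applying `f` to every trainable numeric array and summing the results (`my_total` is my own hypothetical name, not part of any package API):

```julia
using Flux, Functors

# Stand-in for the proposed `total`: sum `f` over every numeric leaf
# array reachable in the model's functor tree.
my_total(f, model) =
    sum(f(x) for x in Functors.fcollect(model) if x isa AbstractArray{<:Number})

chain = Chain(Dense(3 => 5), Dense(5 => 1, relu))
my_total(A -> sum(abs, A), chain)   # an L1-style aggregate over all parameters
```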
Great, thanks! I expect we will be incorporating elements of the two packages in https://github.com/alan-turing-institute/MLJ.jl/issues/139
@mcabbott I'm trying to adapt the `total` method proposed here for a different [use case](https://github.com/FluxML/MLJFlux.jl/issues/221#issuecomment-1707604760) and wondered if the following was expected behaviour:

```julia
chain = Chain(Dense(3=>5), Dense(5=>1, relu))
f(A)...
```
Okay, never mind. I was using `Base.isnumeric` instead of the local one.
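For anyone who hits the same name collision: `Base.isnumeric` is a predicate on characters, while the local one (Optimisers.jl's `isnumeric`, I believe) tests whether a leaf is a numeric array:

```julia
using Optimisers

isnumeric('7')                  # true  -- Base's predicate on characters
Optimisers.isnumeric(rand(3))   # true  -- a numeric array, i.e. a trainable leaf
Optimisers.isnumeric('7')       # false -- not an array of numbers
```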