Kyle Daruwalla
And note that an `apply` interface such as this (or Optimisers.jl) is about explicit (non-)mutation, which is not the same as separating parameters/state from the model (Lux does both)...
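To make the distinction concrete, here is a minimal sketch of what an explicitly non-mutating `apply` looks like. The `Descent` struct and `apply` signature below are illustrative stand-ins, not the exact Optimisers.jl API (which spells things like `setup`/`update`); the point is only the shape of the contract: the rule returns *new* state and *new* parameters rather than mutating either.

```julia
# Hypothetical non-mutating optimiser interface:
#   (rule, state, x, dx) -> (newstate, newx)
struct Descent
    eta::Float64
end

# `state` is unused for plain gradient descent, but kept to show the
# interface shape shared by stateful rules.
function apply(o::Descent, state, x, dx)
    newx = x .- o.eta .* dx   # allocates a new array; `x` is untouched
    return state, newx
end

x  = [1.0, 2.0]
dx = [0.5, 0.5]
_, x2 = apply(Descent(0.1), nothing, x, dx)
# `x` is unchanged; the updated parameters live in `x2`
```

Separating parameters/state from the model is an orthogonal design axis: an interface like this can still close over a stateful model, while a Lux-style design additionally threads parameters and state through every call.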
I agree. Given that you almost never want to set `children`, having to support and explain `children` vs `trainable` is just confusing for users. Anyone wanting to restrict `children` probably...
I have used Flux + FluxTraining quite a bit for recurrent models in the past. In general, you shouldn't need to do anything special. Most of the work is related...
If your data is already in a big rank-3 array, then you can order your axes as `feature x samples x time` and use `Base.Iterators` or MLUtils.jl to partition...
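A small sketch of that layout, using only Base (the array sizes and chunk length here are arbitrary examples, and MLUtils.jl offers richer utilities for the same job): slice the rank-3 array along the time axis to get the sequence-of-matrices form that Flux's recurrent layers consume, then chunk the sequence with `Iterators.partition` for truncated unrolling.

```julia
# feature x samples x time: 4 features, 8 samples, 10 time steps
data = rand(Float32, 4, 8, 10)

# One feature-x-samples matrix per time step.
seq = [data[:, :, t] for t in axes(data, 3)]

# Partition the sequence into chunks of 5 time steps each.
chunks = collect(Iterators.partition(seq, 5))
```

Each element of `seq` is a `4 x 8` matrix, i.e. one batched time step, which is the form a recurrent cell expects when you step through the sequence.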
Note I edited my comments from the original to correct a mistake in the order of the axis dimensions. Clearly, the time I've been spending with Jax recently is leaking...
I assume the change to the primal does not occur outside of a `gradient` call (i.e. when just calling `f(0.5)`)?
I just wanted to bring https://github.com/JuliaCommunity/ML-Coordination-Tracker to everyone's attention. We should reference it somewhere (e.g. a contributing section). There's also an MLH fellow helping us out with one of the community...
Most of those files are CSS for different code block themes? Can we just eliminate all of those files except the theme that we are currently using?
I think it's fine to post this on FluxML, but I agree the content should be expanded to include the full GSoC.
I see two things outstanding: (1) `imsize` support and (2) testing whether a model allows certain options. We can easily do (1) and throw a warning when the model does...