Kyle Daruwalla
True, that's better!
Small side note: I would make the initialization method the first argument to support the `do` syntax, in case anyone needs it.
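To illustrate why argument order matters here: Julia's `do` syntax passes the trailing block as the *first* positional argument, so a constructor that takes its initializer first lets callers define one inline. A minimal sketch, where `make_weights` is a hypothetical stand-in, not the actual constructor under discussion:

```julia
# `do` blocks desugar to `make_weights(f, dims...)`, so the
# initializer must be the first positional argument.
function make_weights(init, dims...)
    return init(dims...)
end

# Without `do`: pass a named initializer.
w1 = make_weights(zeros, 2, 3)

# With `do`: define a custom initializer inline.
w2 = make_weights(2, 3) do dims...
    fill(0.5, dims...)
end
```

If the initializer were a keyword or trailing argument instead, the `do` form would not compose at all.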
The semantic definition of `bias = false` is that trying to load a numeric value into it is ignored. I think that extends to `reweight!` too.
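A sketch of that loading rule: numeric values are copied into array parameters, while a `false` bias (the "no bias" sentinel) silently swallows them. `loadparam!` here is a hypothetical helper for illustration, not the real Flux API:

```julia
# Copy numeric values into a real parameter array.
loadparam!(dst::AbstractArray, src) = copyto!(dst, src)

# bias = false is a structural "no bias" marker: incoming
# numeric values are ignored rather than raising an error.
loadparam!(dst::Bool, src) = dst
```

The same dispatch pattern would make any parameter-mutating utility, `reweight!` included, a no-op on `bias = false` for free.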
This is great! Looking ahead to the website step, I think it would be better and simpler to just host a "model zoo" or "tutorials" section in the Flux docs...
I posted my [long form comment](https://discourse.julialang.org/t/julias-broadcast-vs-jaxs-vmap/38990/37) on discourse, but here are the parts related to this discussion:
- I think to get the performance of fused BLAS operations, the user...
Something like [this](https://julialang.zulipchat.com/#narrow/stream/137791-general/topic/.60jax.2Evmap.60.20vs.20Julia.20Broadcast/near/196838817) where "depth" refers to the depth in the call stack. Does KA do this? That's awesome!
Go for it! For both ViT-based models, make sure to look at the existing ViT implementation and the Layers submodules. Torchvision models are good reference, but ultimately we want something...
Btw the docs issue is now taken care of. Do we want the link in the README to point to the dedicated docs?
We already have support for turning off the batch norm [here](https://github.com/FluxML/Metalhead.jl/blob/88bcacb81b643066adecdc9bb108a6dbcaad9e3c/src/layers/conv.jl#L15-L17). The remaining task on this issue is to update the code for GoogLeNet to use that functionality.
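The pattern in question can be sketched as follows (illustrative only, not Metalhead's exact API): the block constructor takes a flag and swaps the normalization layer for `identity` when it is off.

```julia
# Stand-in for BatchNorm; any normalizing function works here.
scalenorm(x) = x ./ sum(abs, x)

# When normalization is disabled, `identity` takes its place,
# so the rest of the block is written once for both cases.
function conv_block(conv; use_norm::Bool = true)
    norm = use_norm ? scalenorm : identity
    return x -> norm(conv(x))
end
```

Updating GoogLeNet would then just mean threading that flag through its block constructors instead of hard-coding the norm layer.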
Deprecate Flux.Optimise and implicit parameters in favour of Optimisers.jl and explicit parameters
And regularization, e.g. https://github.com/FluxML/Optimisers.jl/pull/57
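A self-contained sketch of the explicit-parameter style this deprecation moves toward, with regularization composed as a gradient transform in the spirit of Optimisers.jl's `OptimiserChain(WeightDecay(λ), Descent(η))`. This is a tiny re-implementation for illustration, not the package itself:

```julia
struct Descent; eta::Float64; end
struct WeightDecay; lambda::Float64; end

# Each rule transforms the gradient explicitly; no global
# implicit-params state is involved.
apply(o::WeightDecay, w, g) = g .+ o.lambda .* w   # L2 penalty as a gradient term
apply(o::Descent, w, g)     = o.eta .* g           # scale by the learning rate

# Thread the gradient through the rule chain, then take the step.
function update(rules::Tuple, w, g)
    for r in rules
        g = apply(r, w, g)
    end
    return w .- g
end
```

Because the weights and gradients are passed around as plain values, regularizers like weight decay become just another composable rule rather than a term bolted onto the loss.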