Michael Abbott
Needs https://github.com/FluxML/NNlib.jl/pull/457. Replaces some of #2137.
Like https://github.com/FluxML/Flux.jl/pull/1656 this wants to make `outputsize(Embedding(3 => 4), (5,)) == (4, 5)`. That is, it thinks the size referred to by `outputsize` should be the size of the array...
If `train!` stops accepting implicit parameters, as in #2082, then its loss function needs to accept the model as an argument, rather than close over it. This makes all the...
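To illustrate the explicit style described above, here is a minimal sketch in which the loss takes the model as its first argument instead of closing over it. It assumes a Flux version that provides `Flux.setup` and the four-argument `train!(loss, model, data, opt_state)`; the toy model, data, and learning rate are made up for illustration.

```julia
using Flux

# Toy model and a single batch of random data (illustrative only).
model = Dense(2 => 1)
data = [(rand(Float32, 2, 8), rand(Float32, 1, 8))]

# Explicit optimiser state, built from the model itself.
opt_state = Flux.setup(Descent(0.1), model)

# The loss receives the model as an argument rather than capturing a global.
loss(m, x, y) = Flux.mse(m(x), y)

x, y = first(data)
before = loss(model, x, y)
for epoch in 1:100
    Flux.train!(loss, model, data, opt_state)
end
after = loss(model, x, y)
```

Because the model is passed in explicitly, the same `loss` function can be reused with a different model without redefining any closure.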
We're discussing removing callbacks in favour of just writing a loop. One nice thing you can do with callbacks is `cb = throttle(mysave, 60)` to save roughly once a minute,...
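The throttling idea still works inside a hand-written loop. The following is a rough, self-contained sketch of a throttle wrapper; `throttle_sketch` and the stand-in `mysave` are hypothetical names for illustration and are not Flux's own implementation.

```julia
# Hypothetical helper: wrap `f` so that it runs at most once per `timeout` seconds.
function throttle_sketch(f, timeout::Real)
    last_run = -Inf                 # time of the last real call
    return function (args...)
        t = time()
        if t - last_run >= timeout
            last_run = t
            return f(args...)
        end
        return nothing              # call skipped: still inside the timeout window
    end
end

saves = Ref(0)
mysave() = (saves[] += 1)           # stand-in for writing a checkpoint to disk

cb = throttle_sketch(mysave, 60)    # save roughly once a minute

for step in 1:1000
    # ... one training step would go here ...
    cb()                            # cheap to call on every iteration
end
```

The first call runs immediately and later calls are skipped until the timeout has elapsed, so the loop body can call `cb()` unconditionally.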
Before this change, Diffractor, which does not like mutation, fails on RNNs:
```julia
julia> using Flux, Zygote, Diffractor

julia> Zygote.gradient(m -> sum(abs2, m([1 2; 3 4f0])), RNN(2 => 3; init=Flux.ones32))
((cell =...
```
The present `show` methods have about a 20s startup delay when the model is on the GPU. This comes from the checks like `any(isnan, x)` which print friendly warnings. Perhaps...
Closes https://github.com/JuliaDiff/ChainRules.jl/issues/698, needs https://github.com/JuliaDiff/ChainRulesCore.jl/pull/615
```julia
julia> using Zygote, BenchmarkTools

julia> @macroexpand1 @fastmath maximum(x)  # this is new
:(Base.FastMath.maximum_fast(x))

julia> VERSION
v"1.10.0-DEV.421"

julia> gradient(x -> sum(maximum(x; dims=1)), [1,3,2])
([0.0, 1.0, 0.0],)

julia> @fastmath gradient(x ->...
```
As I was reminded here https://discourse.julialang.org/t/in-using/90056/5 this package tends to assume that the name `TensorCast` is in scope where the macro runs. It shouldn't.
Does the following, which should be essentially free:
```julia
julia> mask = Bool[1 1 1 0 0; 0 1 1 1 0; 0 0 1 1 1]
3×5 Matrix{Bool}:
 1  1  1...
```