Michael Abbott
Yes. Where's the best place? I don't want every docstring to become a novel, but perhaps the one for `Optimisers.setup` is one good place to show that this recurses into...
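For instance, the `Optimisers.setup` docstring could show something like this short sketch (assuming only that Optimisers.jl is loaded; the nested NamedTuple "model" is made up for illustration):

```julia
using Optimisers

# setup walks the whole model tree and attaches optimiser state
# (a Leaf) to every trainable array, however deeply nested:
model = (layers = ((weight = rand(3, 3), bias = zeros(3)),
                   (weight = rand(3),)),)

state = Optimisers.setup(Optimisers.Adam(), model)
# state mirrors model's structure, with a Leaf at each array
```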
I repeat that no incorrect gradients have been displayed here. Calling other features you happen to dislike in some context "gradient bugs" is just muddying the waters. (There are known...
Re structured arrays, I suspect most of them should be marked `@functor`. I think you are suggesting that the sparse array outcome is undesirable, but I can't reproduce it on...
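To illustrate what marking a type `@functor` buys you (the `Point` struct here is a made-up example, not anything from the package): it lets `fmap` recurse into the fields rather than treating the whole object as a leaf.

```julia
using Functors

struct Point   # hypothetical wrapper type, for illustration only
    x
    y
end
@functor Point

# fmap now recurses into the fields, doubling each array:
p = fmap(a -> 2a, Point([1.0, 2.0], [3.0]))
```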
Ok. Adjoint should now reconstruct:

```julia
julia> destructure(rand(2)')[2]([1.0, 2.0])
1×2 adjoint(::Vector{Float64}) with eltype Float64:
 1.0  2.0

julia> destructure(transpose(rand(2,2)))[2]([1, 2, 3, 4])
2×2 transpose(::Matrix{Float64}) with eltype Float64:
 1.0  2.0
 3.0  4.0
```
...
> If another tree-like `gradient` is passed, then `f` is applied to the leaves of `gradient` (i.e. approximately `fmap(TrainableWalk(f), gradient, model)`, using the last argument to filter the walk)....
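Roughly, walking two matching trees together looks like plain `fmap` with two tree arguments; a simplified sketch of the idea, ignoring the trainable filtering:

```julia
using Functors

grad  = (weight = [1.0, 2.0], bias = 3.0)
model = (weight = [10.0, 20.0], bias = 30.0)

# fmap can zip matching trees, applying f to corresponding leaves:
fmap(+, grad, model)  # (weight = [11.0, 22.0], bias = 33.0)
```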
Wait, there are two big differences from `fmapstructure` / `Flux.state`:

* this is only trainable parameters, and
* tuples & vectors become NamedTuples with made-up field names.

ComponentArrays has no...
One possible design is this:

```julia
reset!(tree) = foreach(reset!, tree)
reset!(ℓ::Leaf) = ℓ.state = reset!(ℓ.rule, ℓ.state)
reset!(::AbstractRule, ::Nothing) = nothing
reset!(rule::AbstractRule, state) = throw(ArgumentError("""reset! does not know how to handle...
```
One thing we could try is adding `@nospecialize` to some `update!` methods? Or even to a whole block of its code.
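For reference, `@nospecialize` hints the compiler not to compile a fresh specialization of the method body per concrete argument type, which can cut compile-time latency at some runtime cost. A generic sketch, not actual Optimisers code:

```julia
# Without the hint, Julia compiles one specialization of this function
# per concrete type of x; with it, one generic body can be reused.
function typename(@nospecialize(x))
    return string(typeof(x))
end

typename(1.0)   # "Float64"
typename("hi")  # "String"
```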
Ah, that looks great, thanks for digging! For me, with the example at the top:

```julia
julia> @btime $re($params);  # This is the reconstruction cost
  min 92.167 μs, mean 301.432 μs
```
...
Mean is from https://github.com/JuliaCI/BenchmarkTools.jl/pull/258, which I should eventually re-write to `@btime / @btimes` or `@bmin / @btime` or something. I see I did write some tests of this, it could...