Michael Abbott
I see I'm blamed in https://github.com/FluxML/Flux.jl/pull/1921 for suggesting that change, although I've forgotten why. With the code above, I see similar numbers to you, `grouped_conv` is faster but has many...
That would be wonderful, sorry I've been snowed under & haven't got around to trying. [This file](https://github.com/mcabbott/TransmuteDims.jl/blob/master/src/strided.jl) is all that touches Strided.jl. Edit: when I make a quick attempt,...
Thanks for the info above. I was also trying a bit here: https://github.com/mcabbott/TransmuteDims.jl/pull/43 and got tests passing, but perhaps there's a bit more to do.
Xref https://github.com/JuliaLang/julia/issues/55735 about why this was removed from the original `stack` PR. At an earlier point, it did infer things like `stack(empty!([(1,2,3)]))` to be `3×0 Matrix`. The same could have...
I think this is https://github.com/JuliaDiff/ChainRules.jl/issues/567 .
Yes it will change both:
```julia
julia> os = Optimisers.setup(OptimiserChain(SignDecay(0.1), AdamW(lambda=0.1)), (; x=[12 34.]))
(x = Leaf(OptimiserChain(SignDecay(0.1), AdamW(0.001, (0.9, 0.999), 0.1, 1.0e-8, true)), (nothing, ([0.0 0.0], [0.0 0.0], (0.9, 0.999)))),)...
```
All things are possible but I don't think we should add further complexity to `adjust!`'s interface. The less-breaking way is to change all 3, and then overload the implementation of...
Note that because of an earlier re-naming (maybe from what Flux called things, maybe to match the AdamW convention, when AdamW made a chain with WeightDecay) you can in fact...
Shorter example:
```julia
julia> Tracker.gradient(sum∘vcat, [1.0, 2.0], [3.0])  # fine
([1.0, 1.0] (tracked), [1.0] (tracked))

julia> Tracker.gradient(sum∘vcat, 1.0, 2.0)  # creates a vector where it wants a scalar
ERROR: MethodError:...
```
Note that this MWE doesn't run:
```
julia> y = _default_autoback_hesvec_cache(x, v)
ERROR: UndefVarError: `_default_autoback_hesvec_cache` not defined in `Main`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
 [1] top-level...
```