Michael Abbott comments

Results: 1315 comments by Michael Abbott

Use/export `LogExpFunctions.jl`?

Sure, I guess. I mean, did this come up somewhere? Every rule in https://github.com/JuliaDiff/ChainRules.jl/blob/main/src/rulesets/Base/arraymath.jl (except for `+` & `-`) closes over things without preventative copies. So they rely on you...
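
To make "closes over things without preventative copies" concrete, here is a minimal sketch in plain Julia. `toy_mul_rule` is a hypothetical stand-in written for illustration, not code from ChainRules.jl; it only mimics the shape of a rule that returns a pullback closure capturing its inputs.

```julia
# Hypothetical rule for matrix multiplication: returns the result and a
# pullback closure. The closure captures A and B themselves, not copies.
function toy_mul_rule(A, B)
    Y = A * B
    pullback(Δ) = (Δ * B', A' * Δ)  # gradients with respect to A and B
    return Y, pullback
end

A = [1.0 2.0; 3.0 4.0]
B = [1.0 0.0; 0.0 1.0]
Y, back = toy_mul_rule(A, B)

dA1, dB1 = back(ones(2, 2))  # uses the original A

A .= 0                       # mutate A after the forward pass
dA2, dB2 = back(ones(2, 2))  # the same closure now sees the mutated A
```

In this sketch the second call returns a different gradient for `B`, because the closure reads `A` at call time. A preventative `copy(A)` inside the rule would avoid that at the cost of an extra allocation.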

add sparsemax

Here's an attempt at a faster version:
```julia
function sm5(x::AbstractArray; dims::Integer=1)
    z = if x isa AbstractVector
        dims == 1 || return float(x)  # do-nothing case, same return type
        sort(float(x);...
```

add sparsemax

> Current version has the correct implementation

Trying to check the gradient numerically:
```
julia> x = rand(1:5, 5,10) ./ 5;
julia> delta = 0 .* x .+ randn.();
julia>...
```
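
A numerical gradient check of this kind can be sketched as follows. The sparsemax code under discussion is not shown in the snippet, so this sketch assumes a column-wise softmax as a stand-in; `softmax_cols`, `softmax_cols_pullback` and `directional_check` are hypothetical helpers defined here, not NNlib functions. The idea is to compare a central finite difference along a random direction `v` against the analytic pullback applied to the same `delta`.

```julia
using LinearAlgebra

# Stand-in function: softmax over each column (dims = 1).
function softmax_cols(x)
    e = exp.(x .- maximum(x; dims=1))
    return e ./ sum(e; dims=1)
end

# Analytic pullback of softmax_cols: maps the output gradient Δ to the input gradient.
function softmax_cols_pullback(x, Δ)
    y = softmax_cols(x)
    return y .* (Δ .- sum(Δ .* y; dims=1))
end

# Compare dot(Δ, directional derivative of f along v) with dot(pullback(Δ), v).
function directional_check(f, pullback, x, Δ; ε=1e-6)
    v = randn(size(x)...)
    numeric = dot(Δ, (f(x .+ ε .* v) .- f(x .- ε .* v)) ./ (2ε))
    analytic = dot(pullback(x, Δ), v)
    return numeric, analytic
end

x = rand(1:5, 5, 10) ./ 5
delta = randn(size(x)...)
numeric, analytic = directional_check(softmax_cols, softmax_cols_pullback, x, delta)
```

If the pullback is right, the two numbers should agree up to finite-difference error. For a piecewise-linear function such as sparsemax, the step `ε` may cross a kink, so a mismatch at isolated points is not conclusive on its own.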

add sparsemax

No, `Δ` is the backward gradient we receive, `y` is the sparsemax output. I'm not sure I ever decoded the paper's notation. But the question of whether the gradient is...

More batched functions such as batched_svd, batched_diagm

@Roger-luo was going to collect many of these in https://github.com/Roger-luo/BatchedRoutines.jl but that was a while ago. Otherwise here sounds OK to me. `batched_mul` got pretty complicated as it wanted to...

Add gradients for `conv_bias_act`, and a similar `dense_bias_act`

My memory is that this basically worked, but the performance was disappointing due to https://github.com/JuliaLang/julia/issues/43153 . Writing back into the same `x` (when safe) saved memory but not time, unless...
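
As a rough illustration of "writing back into the same `x`", the sketch below contrasts an allocating and an in-place fused bias-plus-activation. `toy_relu`, `toy_bias_act` and `toy_bias_act!` are hypothetical names for this example only and are not the functions from the PR; whether overwriting `x` is safe is assumed to be the caller's responsibility.

```julia
toy_relu(x) = max(x, zero(x))

# Allocating version: the fused broadcast creates one new array for the result.
toy_bias_act(x, b) = toy_relu.(x .+ b)

# In-place version: the same fused broadcast, written back into x.
# Only valid when nothing else still needs the old values of x.
function toy_bias_act!(x::AbstractArray, b)
    x .= toy_relu.(x .+ b)
    return x
end

x = [-1.0 2.0; 3.0 -4.0]
b = [0.5, -0.5]

y = toy_bias_act(x, b)    # x is left untouched
x2 = copy(x)
z = toy_bias_act!(x2, b)  # x2 is overwritten and returned
```

Both versions run the same elementwise loop, so the in-place form mainly saves the output allocation, which is consistent with the comment's observation about memory rather than time.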

Add gradients for `conv_bias_act`, and a similar `dense_bias_act`

Rebased at https://github.com/mcabbott/NNlib.jl/tree/bias_act_22 after squashing, but its own tests fail.

`gather` is not friendly with matrix of size 0 on GPU

Looks like `gather` allocates an empty array of the right size: https://github.com/FluxML/NNlib.jl/blob/master/src/gather.jl#L76 so this can probably be fixed by adding a short-circuit like `isempty(dst) && return dst` in `gather!`, before...
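
The placement of such a guard can be sketched on a simplified CPU stand-in. `toy_gather!` and `toy_gather` below are hypothetical and handle only the case of gathering columns of a matrix; they are not NNlib's implementation and do not reproduce the GPU failure, they only show where an early return for an empty destination would sit.

```julia
# Hypothetical column gather: dst[:, k] = src[:, idx[k]].
function toy_gather!(dst::AbstractMatrix, src::AbstractMatrix, idx::AbstractVector{<:Integer})
    size(dst, 1) == size(src, 1) || throw(DimensionMismatch("row counts differ"))
    size(dst, 2) == length(idx) || throw(DimensionMismatch("need one column per index"))
    isempty(dst) && return dst  # short-circuit: nothing to copy, skip the kernel
    for (k, i) in enumerate(idx)
        dst[:, k] .= view(src, :, i)
    end
    return dst
end

# Allocate a destination of the right size, then fill it.
toy_gather(src, idx) = toy_gather!(similar(src, size(src, 1), length(idx)), src, idx)

empty_out = toy_gather(zeros(0, 3), [1, 2])  # 0×2 result, loop never runs
full_out = toy_gather([1 2 3; 4 5 6], [3, 1])
```

Returning `dst` itself keeps the return type and size identical to the non-empty path, so callers should not need a special case.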

Add minimal infrastructure for the docs

What's next, should these show up at https://fluxml.ai/NNlib.jl/dev/ ? If so, should this package's readme link to that?

Lazy broadcasting bug?

Thanks, that's much simpler than my example! The suggested change means that `LazyArray(@~ 1 .* 2)` is a 0-array, and this is preserved by broadcasting with scalars. But ordinary broadcasting...
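
For comparison, the behaviour of Base Julia's eager broadcasting with 0-dimensional arrays can be checked directly. This sketch uses only Base and does not involve `LazyArray` or `@~`, so it says nothing about the package's own semantics.

```julia
a = fill(2)  # a 0-dimensional Array{Int, 0} holding the value 2
b = a .+ 1   # eager broadcast of a 0-array with a scalar
c = 1 .* 2   # eager broadcast of two scalars
```

In Base, both `b` and `c` come back as plain scalars rather than 0-dimensional arrays, which is the baseline any lazy wrapper would be compared against.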