Michael Abbott
Yes, the method is:
```
rand(r::AbstractRNG, ::Type{X}, dims::Dims) where {X} = rand!(r, Array{X}(undef, dims), X)
```
If we allow `rand!(rng, array)` then it's possible we should allow `rand!(rng, array, eltype)`...
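For illustration, here is a minimal sketch of what a `rand!(rng, array, eltype)` method could look like; `my_rand!` and its elementwise loop are hypothetical stand-ins, not the Base definition:

```julia
using Random

# Illustrative sketch only: a rand!-style method taking an explicit element
# type, mirroring the rand method quoted above. `my_rand!` is a made-up name.
function my_rand!(r::AbstractRNG, A::AbstractArray, ::Type{X}) where {X}
    for i in eachindex(A)
        A[i] = rand(r, X)   # draw each element with the requested type
    end
    return A
end

A = zeros(Float64, 3)
my_rand!(Random.default_rng(), A, Float64)
```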
I see it has `test_rrule(+, randn(3), randn(3,1), randn(3,1,1))` for this reshape, but maybe test Diagonal + Matrix or something?
The bug is that this doesn't work:
```julia
julia> ProjectTo(Diagonal([1,2,3]))(randn(3,3,1))
ERROR: MethodError: no method matching (::ProjectTo{Diagonal, ... }}}}}}}})(::Array{Float64, 3})

julia> Diagonal(randn(3)) + randn(3,3,1)  # but this does
3×3×1 Array{Float64, 3}:
...
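For reference, the Base behaviour the projector arguably ought to match: `+` accepts a trailing singleton dimension (via shape promotion) and produces a dense 3×3×1 array, not a `Diagonal`. A small self-contained check:

```julia
using LinearAlgebra

D = Diagonal([1.0, 2.0, 3.0])
A = zeros(3, 3, 1)

# Base's + treats the trailing singleton dimension as compatible,
# so D + A succeeds and returns a dense 3×3×1 Array.
C = D + A
```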
The fix is part of https://github.com/JuliaDiff/ChainRulesCore.jl/pull/446, BTW, which I think should be merged.
Xref an attempt to do this within Zygote: https://github.com/FluxML/Zygote.jl/pull/785, which always got stuck on yet more weird cases.
That's not good! Yes, it's caused by the PR, but I don't know if that's where the bug is. This becomes:
```julia
julia> ForwardDiff.derivative(x -> sum(LinearAlgebra.mat_mat_scalar(a, b, x)), 0.0)
0.0
...
I think this should be fixed by #481, can you confirm?
Right, this won't work. The immediate error is that it tries to define a function `rhs(a,b,a,c,b) = a[a, c] * conj(b[c, b])` (to infer the element type), but later on...
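The duplicated argument names are fatal on their own, before anything later even runs: lowering rejects a function definition whose parameter names repeat. A quick demonstration, where the expression below is only a simplified stand-in for the generated `rhs`:

```julia
# Evaluating a definition with repeated argument names fails during lowering
# with "function argument name not unique". We catch the error to inspect it.
ex = :(rhs(a, b, a, c, b) = a * b * c)
err = try
    eval(ex)
    nothing
catch e
    e
end
```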
## First attempt

With a more expensive function:
```
julia> @btime gradient(x -> sum(maximum(log∘exp, x)), $(rand(30,30)));
  34.791 μs (162 allocations: 11.11 KiB)

julia> @btime gradient(x -> sum(maximum(log∘exp, x, dims=1)), $(rand(30,30)));
...
This has been much simplified. For the case of a complete reduction only, `maximum(f, x)`, this saves the position of the maximum, and calls `rrule_via_ad(f, x[i])` once. This saves memory...
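To illustrate the idea (not the PR's actual code): for `y = maximum(f, x)`, only the entry where the maximum occurred receives a gradient, so a single derivative evaluation suffices. In this sketch `df` stands in for the one `rrule_via_ad(f, x[i])` call, and `grad_maximum` is a hypothetical name:

```julia
# Hypothetical sketch of the "save the position" trick for a complete reduction.
function grad_maximum(f, df, x)
    i = argmax(map(f, x))   # position of the maximum, found in the forward pass
    dx = zero(x)
    dx[i] = df(x[i])        # only this entry gets a nonzero gradient
    return dx
end

x = [1.0, -3.0, 2.0]
dx = grad_maximum(abs2, x -> 2x, x)   # abs2 is largest at x[2] = -3.0
```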