Zygote.jl Gradient of dictionary doesn't contain the keys with zero gradient

Open · CarloLucibello opened this issue 2 years ago · 1 comment

I would expect the gradient of a dictionary to behave like the gradient of a named tuple and contain all of the keys of the original object. For the dict instead, keys with zero gradient (nothing) are dropped:


julia> using Zygote

julia> loss(model) = sum(abs2, model[:a])
loss (generic function with 1 method)

julia> nt = (a = [1.0,2.0], b = [3.0,4.0], c = 1);

julia> gradient(loss, nt)[1]
(a = [2.0, 4.0], b = nothing, c = nothing)

julia> d = Dict(:a => [1.0,2.0], :b => [3.0,4.0], :c => 1);

julia> gradient(loss, d)[1]
Dict{Any, Any} with 1 entry:
  :a => [2.0, 4.0]

Dec 01 '22 05:12 CarloLucibello

Zygote auto-canonicalizes custom structs/NamedTuples but does not do so for Dicts. I don't quite understand the theory behind canonical vs non-canonical tangent types (the ChainRules docs seem to discuss it a little), but one counter-argument would be that Dicts are to NamedTuples what sparse arrays are to dense arrays: not all indices (keys) in the primal need to be defined in the tangent. Which approach is faster, more ergonomic, or more correct? I'm not sure.

Dec 02 '22 04:12 ToucheSir
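
For anyone who needs the NamedTuple-like behaviour in the meantime, a minimal user-side sketch is to pad the sparse Dict gradient with `nothing` for every key of the primal. The helper name `densify_grad` is hypothetical and not part of Zygote or ChainRules. To keep the snippet free of package dependencies, the gradient `g` is written out as the literal value reported in the issue rather than computed.

```julia
# Hypothetical helper (not a Zygote API): give the gradient Dict the same
# keys as the primal Dict, using `nothing` for keys Zygote dropped.
function densify_grad(primal::AbstractDict, grad::AbstractDict)
    return Dict{Any,Any}(k => get(grad, k, nothing) for k in keys(primal))
end

# If the whole gradient is `nothing`, every key maps to `nothing`.
function densify_grad(primal::AbstractDict, ::Nothing)
    return Dict{Any,Any}(k => nothing for k in keys(primal))
end

d = Dict(:a => [1.0, 2.0], :b => [3.0, 4.0], :c => 1)

# The sparse gradient from the report above; in practice this would come
# from `gradient(loss, d)[1]`.
g = Dict{Any,Any}(:a => [2.0, 4.0])

full = densify_grad(d, g)
```

Going the other way, code that consumes the sparse form can read it with `get(g, key, nothing)`, which treats a missing key the same as an explicit `nothing` and so works with either convention.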
