Zygote.jl icon indicating copy to clipboard operation
Zygote.jl copied to clipboard

No Method Matching NamedTuple. Defining gradients through custom types

Open jessebett opened this issue 6 years ago • 13 comments

Taking the Zygote.gradient through a function of Taylor1 from TaylorSeries.jl is throws a Method error on NamedTuple with the symbols being the fields of the struct Taylor1 which are coeffs and order:

using Zygote
using Flux
using TaylorSeries #using master

F = Chain(Dense(1,10,σ),Dense(10,10))

shift_taylor(a,n) = a + Taylor1(typeof(a),n)
t = shift_taylor(1.,5)

Zygote.gradient(()->TaylorSeries.derivative(1,sum(F([t]))),params(F))

julia> Zygote.gradient(()->TaylorSeries.derivative(1,sum(F([t]))),params(F))
ERROR: MethodError: no method matching *(::NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}}, ::Taylor1{Float64})
Closest candidates are:
  *(::Any, ::Any, ::Any, ::Any...) at operators.jl:502
  *(::Bool, ::Taylor1) at /Users/jesse/.julia/packages/TaylorSeries/cbisT/src/arithmetic.jl:209
  *(::Missing, ::Number) at missing.jl:97
  ...
Stacktrace:
 [1] _broadcast_getindex_evalf at ./broadcast.jl:578 [inlined]
 [2] _broadcast_getindex at ./broadcast.jl:551 [inlined]
 [3] getindex at ./broadcast.jl:511 [inlined]
 [4] copy at ./broadcast.jl:787 [inlined]
 [5] materialize at ./broadcast.jl:753 [inlined]
 [6] broadcast(::typeof(*), ::Array{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}, ::LinearAlgebra.Transpose{Taylor1{Float64},Array{Taylor1{Float64},1}}) at ./broadcast.jl:707
 [7] * at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/adjtrans.jl:205 [inlined]
 [8] #850 at /Users/jesse/.julia/dev/Zygote/src/lib/array.jl:117 [inlined]
 [9] (::getfield(Zygote, Symbol("##2131#back#852")){getfield(Zygote, Symbol("##850#851")){Array{Float32,2},Array{Taylor1{Float64},1}}})(::Array{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}) at /Users/jesse/.julia/dev/Zygote/src/lib/grad.jl:41
 [10] Dense at /Users/jesse/.julia/packages/Flux/iSuKc/src/layers/basic.jl:82 [inlined]
 [11] (::typeof(∂(λ)))(::Zygote.FillArray{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [12] applychain at /Users/jesse/.julia/packages/Flux/iSuKc/src/layers/basic.jl:31 [inlined]
 [13] (::typeof(∂(Flux.applychain)))(::Zygote.FillArray{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [14] applychain at /Users/jesse/.julia/packages/Flux/iSuKc/src/layers/basic.jl:31 [inlined]
 [15] (::typeof(∂(Flux.applychain)))(::Zygote.FillArray{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [16] Chain at /Users/jesse/.julia/packages/Flux/iSuKc/src/layers/basic.jl:33 [inlined]
 [17] (::typeof(∂(λ)))(::Zygote.FillArray{NamedTuple{(:coeffs, :order),Tuple{Array{Float64,1},Nothing}},1}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [18] #7 at ./REPL[7]:1 [inlined]
 [19] (::typeof(∂(getfield(Main, Symbol("##7#8"))())))(::Int8) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0
 [20] (::getfield(Zygote, Symbol("##84#85")){Params,Zygote.Context,typeof(∂(getfield(Main, Symbol("##7#8"))()))})(::Int8) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface.jl:95
 [21] gradient(::Function, ::Params) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface.jl:44
 [22] top-level scope at none:0

Per the Zygote dev docs on Custom Types I suspect I need to provide methods for either a constructor or for interface on how Taylor1 are using getfield? Any tips on how to identify where this NamedTuple is resulting from?

jessebett avatar Mar 11 '19 18:03 jessebett

Oh, this error requries @nograd factorial fix from @willtebbutt

jessebett avatar Mar 11 '19 19:03 jessebett

Specifically, the dev docs describe how defining methods for adjoints on functions like height and width which provide a functional interface for getfield will address this. But Taylor1 type in TaylorSeries.jl don't define these functional getfields and instead just use a.coeffs directly to access these fields.

jessebett avatar Mar 11 '19 19:03 jessebett

You could always create a getproperty method that just calls getfield directly and implement a custom adjoint for that? Zygote has literal_getproperty as the method that you can add an adjoint for. See e.g. here. It doesn't appear to be documented, so if you could raise an issue regarding that, that would be great.

willtebbutt avatar Mar 11 '19 20:03 willtebbutt

@willtebbutt I'm a bit confused by what the above suggestion is, sorry. I'm trying to follow this strategy but Zygote is still throwing getfield errors, so I don't see where this is being used. Btw I also tried giving a custom adjoint for the getproperty.

Here's what I tried defining for the custom adjoint, for now, gradients are nothing:

@adjoint function getfield(t::Taylor1{T}, f::Val{:coeffs}) where {T<:Real}
  return getfield(t, f), function(Δ)
    return ((coeffs=nothing, order=nothing),)
  end
end
Zygote.refresh()    

But doing this I still get a method error, though I'm not really sure what the type signature of the method error is referring to. i.e. why is it (::getfield(Main, Symbol...)(::Taylor1...) and not (::getfield(::Taylor1...,Symbol...):

shift_taylor(a,n) = a + Taylor1(typeof(a),n)
tt = shift_taylor(1.,4)
f(x)=x.^2

gradient(()->f(tt).coeffs[2],tt)

gives error:

julia> gradient(()->f(tt).coeffs[2],tt)
ERROR: MethodError: no method matching (::getfield(Main, Symbol("##21#22")))(::Taylor1{Float64})
Closest candidates are:
  #21() at REPL[49]:1
Stacktrace:
 [1] macro expansion at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:0 [inlined]
 [2] _forward(::Zygote.Context, ::getfield(Main, Symbol("##21#22")), ::Taylor1{Float64}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface2.jl:4
 [3] _forward(::Function, ::Taylor1{Float64}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface.jl:31
 [4] forward(::Function, ::Taylor1{Float64}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface.jl:37
 [5] gradient(::Function, ::Taylor1{Float64}) at /Users/jesse/.julia/dev/Zygote/src/compiler/interface.jl:42
 [6] top-level scope at none:0

jessebett avatar Mar 13 '19 20:03 jessebett

Ah, so you need to add an adjoint for literal_getproperty, rather than either getproperty or getfield. If Taylor1 doesn't already implement a method of getproperty, you'll (presumably) need to implement a dummy one, along the lines of

getproperty(t::Taylor1, d::Symbol) = getfield(t, d)

The reason for the literal_getproperty rather than getproperty is that Zygote does something weird. I'm not entirely sure why this is the case, but presumably @MikeInnes would.

willtebbutt avatar Mar 14 '19 15:03 willtebbutt

Right, okay I thought the literal_getpropery was some weird Cholesky thing.

Another question, in the examples you linked it looks like it's actually dispatching on the value of the property being getproperty'd. But when I tested this outside of Zygote, just seeing if I can implement getproperty methods which depend on the property, it does't appear to work:

import Base.getproperty
function getproperty(t::Taylor1{T}, f::Val{:coeffs}) where {T<:Real}
    print("Is this method being called?")
    return getfield(x, Val(f))
end
Taylor1(1.,4).coeffs #Does not call above method

A method with your type signature works, but it looks like @MikeInnes has sensitivities which depend on the property.

jessebett avatar Mar 14 '19 16:03 jessebett

Ah yeah, I reckon that dispatch on the field is a Zygote-specific thing -- you're correct that getproperty just has access to the symbol (as far as I'm aware). See here for literal_getproperty definition. It literally (no pun intended) just calls getproperty by default.

As I say, I'm really not sure why this is a thing. This could probably do with being documented at some point...

willtebbutt avatar Mar 14 '19 16:03 willtebbutt

literal_getproperty is just a hack; we need constant propagation to work so that something like getproperty(1+2im, :re) can be type inferred, but it tends to give up after Zygote has made the code more complex. So instead we rewrite this to literal_getproperty(1+2im, Val(:re)), which forces the compiler to specialise.

This does make it harder to customise how getproperty works though. You'd probably just need to overload both getproperty and literal_getproperty together. At some point I'll dig through what's needed to get Taylors to work and either document it or make it easier.

MikeInnes avatar Mar 22 '19 11:03 MikeInnes

I ran into this issue and managed to create a MWE before I found the issue here

struct M
    a
end

Base.:*(a::M, b::M) = M(a.a*b.a)
Base.:+(a::M, b::M) = M(a.a+b.a)
Base.copy(m::M) = m
LinearAlgebra.adjoint(m::M) = m

function fun(a,b)
    c = M.(a)*M.(b)
    sum(c).a
end

fun(randn(2,2), randn(2,2))
Zygote.gradient(fun, randn(2,2), randn(2,2))

I Expected defining the following methods should solve the problem, but the problem persists

Zygote.@adjoint function *(a::M, b::M)
    M(a.a*b.a), function (d)
        @show d
        (M(d*b.a), M(d*a.a))
    end
end
Zygote.@adjoint function +(a::M, b::M)
    M(a.a+b.a), function (d)
        @show d
        (M(1), M(1))
    end
end

baggepinnen avatar Nov 08 '19 02:11 baggepinnen

The issue is that those adjoints don't apply when you multiply arrays of M together. So we end up multiplying adjoints of M with M there.

If M is a numerical-like type, I suggest making its adjoint another M, which will fix this.

MikeInnes avatar Nov 08 '19 10:11 MikeInnes

I am also impacted by this issue.

A quick fix for me was to stop deriving Number for my type (<: Number). It starts being considered as just a struct with a bunch of function and everything works properly.

(making the adjoint of the same type as my number-like implementation was not an option for me as my type has a number of properties that I don't want to see impacting the gradient computation)

nestordemeure avatar Mar 13 '20 19:03 nestordemeure

Is there's an update on this? I also ran into a similar problem with taking a gradient on NeuralNetwork in Surrogates.jl

MethodError: no method matching (::ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}})(::Tuple{Float64}) Closest candidates are: (::ChainRulesCore.ProjectTo{T})(::Integer) where T<:AbstractFloat at ~/.julia/packages/ChainRulesCore/a4mIA/src/projection.jl:172 (::ChainRulesCore.ProjectTo{<:Real})(::Complex) at ~/.julia/packages/ChainRulesCore/a4mIA/src/projection.jl:187 (::ChainRulesCore.ProjectTo{T})(::ChainRulesCore.AbstractZero) where T at ~/.julia/packages/ChainRulesCore/a4mIA/src/projection.jl:120 ...

pitipatw avatar May 04 '23 15:05 pitipatw

This kind of bug report really needs a MWE. I'd recommend starting a thread on Discourse, Slack or Zulip instead as the root issue may not even be related to Zygote itself.

ToucheSir avatar May 05 '23 02:05 ToucheSir