Zygote.jl
@allowscalar triggers try/catch error
I'm not sure whether I should be filing this here or in CUDA, but temporary @allowscalar blocks lead to errors when taking a gradient.
MWE:
using Zygote, CUDA
CUDA.allowscalar(false)
f(x) = CUDA.@allowscalar x[3]
gradient(f, cu(randn(10, 5)))
and the resulting error
ERROR: Compiling Tuple{typeof(task_local_storage), var"#1#2"{CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, Symbol, Bool}: try/catch is not supported.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] instrument(ir::IRTools.Inner.IR)
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/reverse.jl:121
[3] #Primal#20
@ ~/.julia/packages/Zygote/Lw5Kf/src/compiler/reverse.jl:202 [inlined]
[4] Zygote.Adjoint(ir::IRTools.Inner.IR; varargs::Nothing, normalise::Bool)
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/reverse.jl:315
[5] _generate_pullback_via_decomposition(T::Type)
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/emit.jl:101
[6] #s2989#1184
@ ~/.julia/packages/Zygote/Lw5Kf/src/compiler/interface2.jl:28 [inlined]
[7] var"#s2989#1184"(::Any, ctx::Any, f::Any, args::Any)
@ Zygote ./none:0
[8] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any, N} where N)
@ Core ./boot.jl:571
[9] macro expansion
@ ~/.julia/packages/GPUArrays/3sW6s/src/host/indexing.jl:74 [inlined]
[10] _pullback
@ ./REPL[2]:1 [inlined]
[11] _pullback(ctx::Zygote.Context, f::typeof(f), args::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/interface2.jl:0
[12] _pullback(f::Function, args::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/interface.jl:34
[13] pullback(f::Function, args::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/interface.jl:40
[14] gradient(f::Function, args::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Zygote ~/.julia/packages/Zygote/Lw5Kf/src/compiler/interface.jl:75
[15] top-level scope
@ REPL[11]:1
[16] top-level scope
@ ~/.julia/packages/CUDA/YpW0k/src/initialization.jl:52
This may or may not be related to https://github.com/FluxML/Zygote.jl/issues/1070
This should not be going into task_local_storage. What's the behaviour if you instead use CUDA.allowscalar(false) globally?
It's fine, but doing both gives the same error:
using Zygote, CUDA
CUDA.allowscalar(true)
f(x) = CUDA.@allowscalar x[3]
gradient(f, cu(randn(10, 5)))
# above error still
g(x) = x[3]
gradient(g, cu(randn(10, 5)))
# fine
The task_local_storage call is coming from https://github.com/JuliaGPU/GPUArrays.jl/blob/bb9ca6d1f11e82a1d495cb9cf39cee9c215491e0/src/host/indexing.jl#L72-L78.
Oh, odd: that was changed several months ago (5 months according to blame). One can use Zygote.@nograd task_local_storage, and that should be sufficient, I think.
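For reference, a minimal CPU-only sketch of the @nograd suggestion. It assumes a Zygote version that still exports the @nograd macro, and it uses a hand-written task_local_storage call with a made-up key (:hypothetical_key) as a stand-in for what @allowscalar expands to. It has not been tried on a GPU.

```julia
using Zygote

# Mark task_local_storage as non-differentiable, so Zygote no longer tries
# to compile its try/catch body.
Zygote.@nograd task_local_storage

# CPU stand-in for the macro expansion; the key name is hypothetical.
h(x) = task_local_storage(() -> x[3], :hypothetical_key, true)

grad = gradient(h, randn(10, 5))
```

One thing to check: since the indexing happens inside the closure passed to task_local_storage, marking the whole call as non-differentiable may also drop the gradient of x, so this could silence the error without giving a useful result.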
The error has changed to
julia> using Zygote, CUDA
julia> CUDA.allowscalar(false)
julia> f(x) = CUDA.@allowscalar x[3]
f (generic function with 1 method)
julia> gradient(f, cu(randn(10, 5)))
ERROR: Scalar indexing is disallowed.
Invocation of setindex! resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:87
[3] setindex!(xs::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, v::Float32, I::Int64)
@ GPUArrays ~/.julia/packages/GPUArrays/fqD8z/src/host/indexing.jl:17
[4] #383
@ ~/.julia/packages/Zygote/AS0Go/src/lib/array.jl:51 [inlined]
[5] #2451#back
@ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
[6] Pullback
@ ~/.julia/packages/Zygote/AS0Go/src/tools/builtins.jl:12 [inlined]
[7] Pullback
@ ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:109 [inlined]
[8] ad_pullback
@ ~/.julia/packages/Zygote/AS0Go/src/compiler/chainrules.jl:258 [inlined]
[9] task_local_storage_pullback
@ ~/.julia/packages/ChainRules/ajkp7/src/rulesets/Base/base.jl:261 [inlined]
[10] ZBack
@ ~/.julia/packages/Zygote/AS0Go/src/compiler/chainrules.jl:206 [inlined]
[11] macro expansion
@ ~/.julia/packages/GPUArraysCore/lojQM/src/GPUArraysCore.jl:108 [inlined]
[12] Pullback
@ ./REPL[3]:1 [inlined]
[13] (::Zygote.var"#60#61"{typeof(∂(f))})(Δ::Float32)
@ Zygote ~/.julia/packages/Zygote/AS0Go/src/compiler/interface.jl:45
[14] gradient(f::Function, args::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
@ Zygote ~/.julia/packages/Zygote/AS0Go/src/compiler/interface.jl:97
[15] top-level scope
@ REPL[4]:1
[16] top-level scope
@ ~/.julia/packages/CUDA/Ey3w2/src/initialization.jl:52
It seems that the local @allowscalar is ignored in the pullback.
The problem is how @allowscalar is defined: it means we can't use dispatch to capture the call and differentiate through the task_local_storage callback. The only solution seems to be writing a rule for task_local_storage(f, k, v), and I can't wrap my head around what that would look like.
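A rough sketch of what such a rule might look like, written with ChainRulesCore. To avoid overloading Base.task_local_storage directly, it defines the rule for a hypothetical wrapper, with_tls, and uses a made-up key (:hypothetical_key). The idea is to set the task-local value in the pullback as well as in the forward pass, so that code run during the reverse pass sees the same setting. This is only tested in my head on the CPU, and whether it is enough for the GPU case is an assumption.

```julia
using ChainRulesCore, Zygote

# Hypothetical wrapper, so the sketch does not add methods for a Base function.
with_tls(body, key, value) = task_local_storage(body, key, value)

function ChainRulesCore.rrule(config::RuleConfig{>:HasReverseMode},
                              ::typeof(with_tls), body, key, value)
    # Forward pass: let the AD system differentiate the zero-argument body
    # while the task-local value is set.
    y, body_back = task_local_storage(key, value) do
        rrule_via_ad(config, body)
    end
    function with_tls_pullback(dy)
        # Reverse pass: set the same task-local value again before running
        # the body's pullback.
        dbody = task_local_storage(key, value) do
            only(body_back(dy))
        end
        # Tangents for: with_tls itself, body (its captured variables), key, value.
        return (NoTangent(), dbody, NoTangent(), NoTangent())
    end
    return y, with_tls_pullback
end

h(x) = with_tls(() -> x[3], :hypothetical_key, true)
grad = gradient(h, collect(1.0:10.0))
```

The gradient with respect to x flows back through the tangent of the closure (dbody), which is why the body's tangent is returned rather than NoTangent().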