Enzyme.jl

error in minimal Lux example

Open ExpandingMan opened this issue 1 year ago • 3 comments

using Enzyme
using Random
using Lux, Optimisers, MLUtils

testmodel() = Chain(
    Dense(4=>8, relu),
    Dense(8=>8, relu),
    Dense(8=>2),
)

testdata() = (randn(4, 256), randn(2, 256))

dl = DataLoader(testdata(), batchsize=6)

ℓ = MSELoss()

(X, y) = first(dl)

mdl = testmodel()
(θ, ψ) = Lux.setup(Xoshiro(999), mdl)

s = Training.TrainState(mdl, θ, ψ, Adam(0.01))

(∂θ, l, _, s′) = Training.compute_gradients(AutoEnzyme(), ℓ, (X, y), s)

Currently (0.13.10) gives

ERROR: TypeError: in typeassert, expected LLVM.GlobalVariable, got a value of type LLVM.PointerNull
Stacktrace:
  [1] check_ir!(job::GPUCompiler.CompilerJob{…}, errors::Vector{…}, imported::Set{…}, f::LLVM.Function, deletedfns::Vector{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/RmraO/src/compiler/validation.jl:460
  [2] check_ir!(job::GPUCompiler.CompilerJob{…}, errors::Vector{…}, mod::LLVM.Module)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/RmraO/src/compiler/validation.jl:382
  [3] check_ir
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler/validation.jl:163 [inlined]
  [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:6447
  [5] codegen
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:6371 [inlined]
  [6] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8642
  [7] _thunk
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8642 [inlined]
  [8] cached_compilation
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8683 [inlined]
  [9] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{…}, ::Type{…}, ::Type{…}, tt::Type{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Type{…}, ::Val{…}, ::Val{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8815
 [10] #s2067#19122
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8952 [inlined]
 [11]
    @ Enzyme.Compiler ./none:0
 [12] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:707
 [13] autodiff
    @ ~/.julia/packages/Enzyme/RmraO/src/Enzyme.jl:473 [inlined]
 [14] compute_gradients(ad::AutoEnzyme{…}, obj_fn::GenericLossFunction{…}, data::Tuple{…}, ts::Lux.Training.TrainState{…})
    @ LuxEnzymeExt ~/.julia/packages/Lux/VkHFW/ext/LuxEnzymeExt/training.jl:8

I know this may not be expected to work yet, but it looks to me like an Enzyme issue, so I am reporting it here.

ExpandingMan avatar Oct 17 '24 19:10 ExpandingMan

What are the Julia version and the versions of all the other packages?

wsmoses avatar Oct 17 '24 19:10 wsmoses

I restored the using statements above.

  • Enzyme latest main
  • julia 1.11.1
  • Lux 1.1.0
  • Optimisers 0.3.3
  • MLUtils 0.4.4

These should all be the latest, I think.

ExpandingMan avatar Oct 17 '24 22:10 ExpandingMan

What happens on Julia 1.10? Enzyme doesn't fully support 1.11 yet.

wsmoses avatar Oct 17 '24 22:10 wsmoses

With another big 1.11 patch (not complete, but ongoing), I think the root issue here is fixed. If not, please reopen.

wsmoses avatar Oct 23 '24 06:10 wsmoses

Seems to work, thanks! However, I am getting the following two warnings:

┌ Warning: `called_value(inst::CallBase)` is deprecated, use `called_operand(inst)` instead.
│   caller = check_ir!(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, errors::Vector{Tuple{String, Vector{Base.StackTraces.StackFrame}, Any}}, imported::Set{String}, f::LLVM.Function, deletedfns::Vector{LLVM.Function}) at validation.jl:501
└ @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler/validation.jl:501
┌ Warning: Mixed-Precision `matmul_cpu_fallback!` detected and Octavian.jl cannot be used for this set of inputs (C [Matrix{Float64}]: A [Matrix{Float32}] x B [Matrix{Float64}]). Falling back to generic implementation. This may be slow.
└ @ LuxLib.Impl ~/.julia/packages/LuxLib/pu6Tq/src/impl/matmul.jl:145

I am reopening and changing the title to reflect this.

ExpandingMan avatar Oct 23 '24 14:10 ExpandingMan

Oh, I don't actually have permission to reopen. Please reopen if you think it's appropriate.

ExpandingMan avatar Oct 23 '24 14:10 ExpandingMan

Done, though feel free to just open a PR to fix it.

wsmoses avatar Oct 23 '24 16:10 wsmoses

┌ Warning: Mixed-Precision `matmul_cpu_fallback!` detected and Octavian.jl cannot be used for this set of inputs (C [Matrix{Float64}]: A [Matrix{Float32}] x B [Matrix{Float64}]). Falling back to generic implementation. This may be slow.
└ @ LuxLib.Impl ~/.julia/packages/LuxLib/pu6Tq/src/impl/matmul.jl:145

Just to clarify, Enzyme isn't at fault for the second warning. It only indicates that you are mixing Float64 (see the inputs in your code) and Float32 (the default parameters).
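
As a minimal sketch of one way to avoid the mixed-precision fallback, the inputs can be generated as Float32 so that they match the default parameter type. The name `testdata32` is hypothetical and used only for illustration; it is a Float32 variant of the `testdata()` function from the original example, and it needs nothing beyond Base Julia.

```julia
# Hypothetical Float32 variant of the issue's `testdata()`.
# `randn(Float32, dims...)` draws standard-normal samples directly as Float32.
testdata32() = (randn(Float32, 4, 256), randn(Float32, 2, 256))

X, y = testdata32()

# Inspect the element types; both should match the Float32 default parameters.
@show eltype(X) eltype(y)
```

For arrays that already exist as Float64, broadcasting the constructor (for example `Float32.(X)`) should give an equivalent conversion.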

avik-pal avatar Nov 01 '24 20:11 avik-pal

warning resolved by https://github.com/EnzymeAD/Enzyme.jl/pull/2053

wsmoses avatar Nov 04 '24 05:11 wsmoses