Enzyme.jl

JIT error

swilliamson7 opened this issue 2 years ago • 12 comments

Opening a new issue with a weird error I'm seeing when trying to use Enzyme. The output is:

JIT session error: In graph -jitted-objectbuffer, section __TEXT,__text: relocation target "l___unnamed_142" at address 0x7a5f72675 is out of range of Page21 fixup at 0x6dbb1d774 (_diffejulia_time_integration_debug_3453_inner_1wrap, 0x6dbb18308 + 0x546c)
JIT session error: In graph -jitted-objectbuffer, section __TEXT,__text: relocation target "l___unnamed_142" at address 0x96ca4e675 is out of range of Page21 fixup at 0x8a25f9774 (_diffejulia_time_integration_debug_5640_inner_1wrap, 0x8a25f4308 + 0x546c)
JIT session error: In graph -jitted-objectbuffer, section __TEXT,__text: relocation target "l___unnamed_142" at address 0xb38b72675 is out of range of Page21 fixup at 0xa6e71d774 (_diffejulia_time_integration_debug_5645_inner_1wrap, 0xa6e718308 + 0x546c)
ERROR: LoadError: LLVM error: Failed to materialize symbols: { (main, { _diffejulia_time_integration_debug_5645_inner_1wrap }) }
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/LLVM/Od0DH/src/executionengine/utils.jl:32 [inlined]
  [2] lookup
    @ ~/.julia/packages/LLVM/Od0DH/src/orcv2.jl:228 [inlined]
  [3] lookup
    @ ~/.julia/packages/Enzyme/vjxwv/src/compiler/orcv2.jl:262 [inlined]
  [4] _link(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, ::Tuple{LLVM.Module, String, Nothing, DataType})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/vjxwv/src/compiler.jl:9688
  [5] cached_compilation
    @ ~/.julia/packages/Enzyme/vjxwv/src/compiler.jl:9744 [inlined]
  [6] (::Enzyme.Compiler.var"#475#476"{DataType, DataType, DataType, Enzyme.API.CDerivativeMode, Tuple{Bool, Bool}, Int64, Bool, Bool, UInt64, DataType})(ctx::LLVM.Context)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/vjxwv/src/compiler.jl:9806
  [7] JuliaContext(f::Enzyme.Compiler.var"#475#476"{DataType, DataType, DataType, Enzyme.API.CDerivativeMode, Tuple{Bool, Bool}, Int64, Bool, Bool, UInt64, DataType})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/kwkKA/src/driver.jl:58
  [8] #s260#474
    @ ~/.julia/packages/Enzyme/vjxwv/src/compiler.jl:9761 [inlined]
  [9] var"#s260#474"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ::Any, #unused#::Type, #unused#::Type, #unused#::Type, tt::Any, #unused#::Type, #unused#::Type, #unused#::Type, #unused#::Type, #unused#::Type, #unused#::Any)
    @ Enzyme.Compiler ./none:0
 [10] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any})
    @ Core ./boot.jl:582
 [11] autodiff
    @ ~/.julia/packages/Enzyme/vjxwv/src/Enzyme.jl:207 [inlined]
 [12] autodiff
    @ ~/.julia/packages/Enzyme/vjxwv/src/Enzyme.jl:236 [inlined]
 [13] autodiff
    @ ~/.julia/packages/Enzyme/vjxwv/src/Enzyme.jl:222 [inlined]
 [14] run_enzyme(#unused#::Type{Float32}, P::Parameter)
    @ Main ~/Documents/GitHub/ShallowWaters.jl/src/for_enzymefolks.jl:1896
 [15] #run_enzyme#48
    @ ~/Documents/GitHub/ShallowWaters.jl/src/for_enzymefolks.jl:1859 [inlined]
 [16] top-level scope
    @ ~/Documents/GitHub/ShallowWaters.jl/src/for_enzymefolks.jl:1902
in expression starting at /Users/swilliamson/Documents/GitHub/ShallowWaters.jl/src/for_enzymefolks.jl:1902
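For scale, the three `JIT session error` lines above can be checked with plain integer arithmetic. The following is a minimal sketch, with the addresses copied from the log and variable names that are illustrative only, printing how far each relocation target sits from its fixup:

```julia
# (target, fixup) address pairs copied from the three JIT session errors.
errors = [
    (0x7a5f72675, 0x6dbb1d774),
    (0x96ca4e675, 0x8a25f9774),
    (0xb38b72675, 0xa6e71d774),
]

# Signed byte distance from each fixup to its relocation target.
deltas = [Int64(target) - Int64(fixup) for (target, fixup) in errors]

for d in deltas
    println("0x", string(d, base = 16), " bytes ≈ ", round(d / 2^30, digits = 2), " GiB")
end
```

All three attempts give the same distance, 0xca454f01 bytes (about 3.16 GiB), which suggests the memory layout that puts `l___unnamed_142` out of reach is reproduced on every attempt rather than being a one-off placement.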

I don't have an MWE; it's been difficult for me to narrow down where this is coming from. The lines of code that the stack trace refers to just lead to the autodiff call, which isn't too helpful. In the code I am trying to call autodiff on a structure of structures, S in autodiff(Reverse, time_integration_debug, Duplicated(S, dS)), which maybe Enzyme has trouble with? I'm just initializing dS with deepcopy; maybe this is a bad idea?

Without copying all my code, the other small thing I noticed when trying to make an MWE is that, inside time_integration_debug, I call a function PV! that seems to trigger the error. However, if I just cut and paste the lines from PV! into time_integration_debug, the code does run. In other words, it's the difference between

### other lines of code 

PV!(Diag,S)

return nothing 

or

### same lines of code as before 

# The following lines are just the definition of PV!
@unpack q,dvdx,dudy,h_q = Diag.Vorticity
@unpack f_q,ep = S.grid

m,n = size(q)
@boundscheck (m,n) == size(f_q) || throw(BoundsError())
@boundscheck (m+2,n+2) == size(dvdx) || throw(BoundsError())
@boundscheck (m+2+ep,n+2) == size(dudy) || throw(BoundsError())
@boundscheck (m,n) == size(h_q) || throw(BoundsError())

@inbounds for j ∈ 1:n
    for i ∈ 1:m
        q[i,j] = (f_q[i,j] + dvdx[i+1,j+1] - dudy[i+1+ep,j+1]) / h_q[i,j]
    end
end

return nothing 

Because these two chunks are doing the exact same thing, I'm a little lost about what's going on and how to proceed. I'm hoping this is all caused by me doing something silly, and I really appreciate any insight you can give!
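On the `deepcopy` question: in reverse mode Enzyme accumulates gradients into the shadow, so the shadow is usually zero-initialized, apart from a seed on the output, rather than copied from the primal. Below is a minimal sketch on a toy array function. `sumsq!` and its array layout are hypothetical stand-ins, not code from ShallowWaters.jl, and the sketch does not address the JIT error itself:

```julia
using Enzyme

# Hypothetical toy: stores the sum of squares of x[1:end-1] in x[end].
function sumsq!(x)
    s = zero(eltype(x))
    for i in 1:length(x)-1
        s += x[i]^2
    end
    x[end] = s
    return nothing
end

x  = [1.0, 2.0, 3.0, 0.0]
dx = zero(x)      # shadow starts at zero, not as a copy of x
dx[end] = 1.0     # seed the derivative of the output slot

autodiff(Reverse, sumsq!, Duplicated(x, dx))

# dx[1:3] now holds d(x[end])/d(x[i]), i.e. 2 .* [1.0, 2.0, 3.0]
println(dx[1:3])
```

Had `dx` started as `deepcopy(x)`, the gradient would be accumulated on top of the copied values. For a structure of structures like S, the same idea would apply to every array field of dS.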

swilliamson7, Aug 21 '23 20:08

What's the versioninfo() of your machine?

vchuravy, Aug 21 '23 20:08

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.5.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

And if it helps, I've been running on Enzyme v0.11.7

swilliamson7, Aug 21 '23 20:08

Sigh, I think we are still running with RTDYLD instead of JITLink.

I will take a look.

cc:@gbaraldi

vchuravy, Aug 21 '23 20:08

Is it something to do with my computer and not the code, or a combination of both?

swilliamson7, Aug 21 '23 20:08

Could you try on 1.10? I believe there we should use the JITLink machinery, no? It might also be the arm64-apple arch playing oddly with the JIT; we had this in base Julia and it might be showing up here as well.

gbaraldi, Aug 21 '23 22:08

Just to confirm, you mean Julia v1.10, correct? So the beta version?

swilliamson7, Aug 21 '23 22:08

Yeah, the beta version of 1.10.

vchuravy, Aug 21 '23 22:08

Running on Julia 1.10 seems to have made it go away! Does this mean that if I want to use Enzyme with this code, I'll need to be using a beta version of Julia...?

swilliamson7, Aug 21 '23 23:08

Potentially. 1.10 has some big improvements for Enzyme overall (better errors as well).

gbaraldi, Aug 21 '23 23:08

Okay, for now I'll run on that Julia version then. Thanks for your help!
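A script that relies on this workaround could check for the affected setup up front. The sketch below is only an illustration: `hits_reported_setup` is a hypothetical helper, and the cutoff assumes what was observed in this thread (failure on 1.8.5, success on the 1.10 beta, both on an M1 Mac); versions in between were not tested here.

```julia
# Hypothetical helper: true for the setup reported in this thread,
# i.e. an Apple-silicon Mac running a Julia older than the 1.10 pre-releases.
function hits_reported_setup(version::VersionNumber, arch::Symbol, apple::Bool)
    return apple && arch === :aarch64 && version < v"1.10.0-"
end

if hits_reported_setup(VERSION, Sys.ARCH, Sys.isapple())
    @warn "Julia $VERSION on Apple silicon: the JIT relocation error from this thread was seen on 1.8.5; consider Julia 1.10 or newer."
end
```

The trailing hyphen in `v"1.10.0-"` makes the bound sort below every 1.10 pre-release, so `1.10.0-beta2` counts as new enough.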

swilliamson7, Aug 21 '23 23:08

@swilliamson7 what OS and machine are you on (an M1 mac)?

wsmoses, Sep 14 '23 22:09

@wsmoses Yeah, it's an M1 Mac. This is the version info:

Julia Version 1.10.0-beta2
Commit a468aa198d0 (2023-08-17 06:27 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores

swilliamson7, Sep 15 '23 16:09

Closing as resolved in later Julia versions.

wsmoses, May 07 '24 14:05