Enzyme.jl

segfault on 1.11

Open Krastanov opened this issue 6 months ago • 3 comments

using SciMLSensitivity
using DifferentialEquations
using LinearAlgebra
import Enzyme
function fiip(du, u, p, t)
	du[1] = dx = p[1] * u[1] - p[2] * u[1] * u[2]
	du[2] = dy = -p[3] * u[2] + p[4] * u[1] * u[2]
end
p = [1.5, 1.0, 3.0, 1.0];
u0 = [1.0; 1.0];
dp = similar(p)
prob = ODEProblem(fiip, u0, (0.0, 10.0), p)
sol = solve(prob, Tsit5())
loss(u0, p) = norm(solve(
	ODEProblem(fiip, u0, (0.0, 10.0), p),
	Tsit5(); saveat = 0.1
).u[end])
Enzyme.autodiff(
	Enzyme.set_runtime_activity(Enzyme.Reverse),
	loss,
	Enzyme.Active,
	Enzyme.Const(u0),
	Enzyme.Duplicated(p, dp)
)

segfaults with

[121322] signal 11 (1): Segmentation fault
in expression starting at REPL[12]:1
jl_gc_pool_alloc_inner at /cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/src/gc.c:1329

version info:

(tmp) pkg> st
Status `/tmp/Project.toml`
  [0c46a032] DifferentialEquations v7.16.1
  [7da242da] Enzyme v0.13.41
  [1ed8b502] SciMLSensitivity v7.78.0

julia> versioninfo()
Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × AMD Ryzen 9 7950X 16-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
Threads: 32 default, 1 interactive, 16 GC (on 32 virtual cores)
Environment:
  JULIA_EDITOR = vim

Krastanov • May 12 '25 01:05

If I instead construct the problem as ODEProblem{true}, I get an error rather than a segfault:

julia> loss(u0, p) = norm(solve(
               ODEProblem{true}(fiip, u0, (0.0, 10.0), p),
               Tsit5(); saveat = 0.1
       ).u[end])
loss (generic function with 1 method)

julia> loss(u0, p)
1.3748320374105525

julia> Enzyme.autodiff(
               Enzyme.set_runtime_activity(Enzyme.Reverse),
               loss,
               Enzyme.Active,
               Enzyme.Const(u0),
               Enzyme.Duplicated(p, dp)
       )
ERROR: Conversion of boxed type Any is not allowed
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] convert(::Type{LLVM.LLVMType}, typ::Type; allow_boxed::Bool)
    @ LLVM.Interop ~/.julia/packages/LLVM/xTJfF/src/interop/base.jl:77
  [3] convert
    @ ~/.julia/packages/LLVM/xTJfF/src/interop/base.jl:71 [inlined]
  [4] lower_convention(functy::Type, mod::LLVM.Module, entry_f::LLVM.Function, actualRetType::Type, RetActivity::Type, TT::Union{…}, run_enzyme::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:2720
  [5] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:4286
  [6] codegen
    @ ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:3450 [inlined]
  [7] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:5528
  [8] _thunk
    @ ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:5528 [inlined]
  [9] cached_compilation
    @ ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:5580 [inlined]
 [10] thunkbase(mi::Core.MethodInstance, World::UInt64, FA::Type{…}, A::Type{…}, TT::Type, Mode::Enzyme.API.CDerivativeMode, width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, edges::Vector{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:5691
 [11] thunk_generator(world::UInt64, source::LineNumberNode, FA::Type, A::Type, TT::Type, Mode::Enzyme.API.CDerivativeMode, Width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, self::Any, fakeworld::Any, fa::Type, a::Type, tt::Type, mode::Type, width::Type, modifiedbetween::Type, returnprimal::Type, shadowinit::Type, abi::Type, erriffuncwritten::Type, runtimeactivity::Type)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/3VNOP/src/compiler.jl:5876
 [12] autodiff
    @ ~/.julia/packages/Enzyme/3VNOP/src/Enzyme.jl:485 [inlined]
 [13] autodiff(::EnzymeCore.ReverseMode{…}, ::typeof(loss), ::Type{…}, ::EnzymeCore.Const{…}, ::EnzymeCore.Duplicated{…})
    @ Enzyme ~/.julia/packages/Enzyme/3VNOP/src/Enzyme.jl:524
 [14] top-level scope
    @ REPL[14]:1
Some type information was truncated. Use `show(err)` to see complete types.

ODEProblem{true} works fine on Julia 1.10.

Krastanov • May 12 '25 13:05

@Krastanov, a GC fix for Julia 1.11 landed recently. Can you check whether this still errors?

wsmoses • Jun 23 '25 18:06

It still seems to fail on 1.11.5:

julia> versioninfo()
Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 5700G with Radeon Graphics
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 16 default, 0 interactive, 8 GC (on 16 virtual cores)

(tmp) pkg> st
Status `/tmp/Project.toml`
  [0c46a032] DifferentialEquations v7.16.1
  [7da242da] Enzyme v0.13.51
  [1ed8b502] SciMLSensitivity v7.86.1
  [37e2e46d] LinearAlgebra v1.11.0

Krastanov • Jun 23 '25 20:06

Okay, another GC fix landed. Can you retry?

wsmoses • Jul 02 '25 15:07

On 1.11 it simply errors out, no segfaults:

julia> loss(u0, p) = norm(solve(
               ODEProblem(fiip, u0, (0.0, 10.0), p),
               Tsit5(); saveat = 0.1
       ).u[end])
loss (generic function with 1 method)

julia> Enzyme.autodiff(
               Enzyme.set_runtime_activity(Enzyme.Reverse),
               loss,
               Enzyme.Active,
               Enzyme.Const(u0),
               Enzyme.Duplicated(p, dp)
       )
ERROR: Enzyme execution failed.
Enzyme: Non-constant keyword argument found for Tuple{UInt64, typeof(Core.kwcall), EnzymeCore.Duplicated{@NamedTuple{saveat::Float64}}, typeof(EnzymeCore.EnzymeRules.augmented_primal), EnzymeCore.EnzymeRules.RevConfigWidth{1, true, true, (false, true, false, true, true, false), true, false}, EnzymeCore.Const{typeof(DiffEqBase.solve_up)}, Type{EnzymeCore.Duplicated{Any}}, EnzymeCore.Duplicated{ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, Vector{Float64}, ODEFunction{true, SciMLBase.AutoSpecialize, typeof(fiip), UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing, Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}, SciMLBase.StandardODEProblem}}, EnzymeCore.Const{Nothing}, EnzymeCore.Duplicated{Vector{Float64}}, EnzymeCore.Duplicated{Vector{Float64}}, EnzymeCore.Const{Tsit5{typeof(OrdinaryDiffEqCore.trivial_limiter!), typeof(OrdinaryDiffEqCore.trivial_limiter!), Static.False}}}

If I manually specify ODEProblem{true}, I get the result ((nothing, nothing),), which is wrong.

On 1.12-beta4 Enzyme does not install.

On 1.10.10 it fails the same way as on 1.11 (not a segfault, just an ERROR: Enzyme cannot deduce type).

On 1.10.10 if I specify ODEProblem{true} I still get ((nothing, nothing),).

The segfault is fixed. I am not sure whether this issue should stay open to track the fact that the autodiff call still fails (with an error message, not a segfault). Please proceed according to your policy for the issue tracker. I am happy to report the autodiff failure as a separate issue.

Krastanov • Jul 02 '25 16:07

So the nothings are actually correct. Duplicated means the gradient is written in place into the shadow (here dp), and nothing is returned for that argument [since it is modified in place, not newly returned].

As for the other problems (the keyword-argument error and the "cannot deduce type" error), can you open separate issues for them?

wsmoses • Jul 02 '25 16:07
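
To illustrate the in-place convention for Duplicated discussed in the last comments, here is a minimal sketch on a toy function. The name toy_loss and its body are assumptions made for illustration; it stands in for the ODE-based loss from the report and does not involve DifferentialEquations or SciMLSensitivity. The shadow is created with zero(p) rather than similar(p), because reverse mode accumulates into the shadow, so uninitialized memory would contaminate the gradient.

```julia
import Enzyme

# Hypothetical stand-in for the ODE-based loss in this thread.
toy_loss(u0, p) = u0[1] * sum(abs2, p)

u0 = [2.0]
p = [1.5, 1.0, 3.0, 1.0]
dp = zero(p)  # shadow must start at zero: the gradient is accumulated into it

result = Enzyme.autodiff(
    Enzyme.Reverse,
    toy_loss,
    Enzyme.Active,
    Enzyme.Const(u0),          # no derivative requested for u0
    Enzyme.Duplicated(p, dp),  # derivative with respect to p is written into dp
)

# The returned tuple has a `nothing` slot for each Const and Duplicated argument.
println(result)
# The gradient itself is read from the shadow.
println(dp)
```

After the call, the gradient with respect to p is read from dp (for this toy function it equals 2 * u0[1] .* p), while the return value only contains nothing entries for the Const and Duplicated arguments.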