Valentin Churavy
An MWE would be great. This error means that one of Enzyme's assumptions was violated.
I am kinda wondering if people would like this to be in EnzymeCore, but I don't necessarily want to take on the dependency of ChainRulesCore there. But it depends on...
We should probably handle exceptional control flow better; gradients could propagate through PhiC nodes.
We will need to look at `@device_code_dump`, but I am wondering if your libm function replacement is inserting calls to pointers. Given that we are looking at isfinite
So I immediately see: ``` @0 = private unnamed_addr constant [7808 x i8] c"Enzyme compilation failed.\0ACurrent scope: \0A; Function Attrs: mustprogress willreturn\0Adefine void @preprocess_julia_kernel__5774_inner5({ i8 addrspace(1)*, i64, [1 x i64],...
And the verifier complains about: ``` ; │││││┌ @ /home/jgreener/.julia/dev/CUDA/src/device/intrinsics/math.jl:190 within `#isfinite` call void inttoptr (i64 140245794510672 to void (i8*)*)(i8* getelementptr inbounds ([7808 x i8], [7808 x i8]* @0, i64...
So the short answer is: ``` No augmented forward pass found for __nv_isfinited at context: %37 = call i32 @__nv_isfinited(double %35) #6, !dbg !145\0A\0AStacktrace: [1] #isfinite @ ~/.julia/dev/CUDA/src/device/intrinsics/math.jl:190 [2] macro...
The `julia_error` callback could query the target (from the data layout?) and, if it's nvptx, use GPUCompiler's exception reporting: https://github.com/JuliaGPU/GPUCompiler.jl/blob/e5cd575dd44bec8d2914f3ce2cd1ae83dcd9ac91/src/irgen.jl#L220
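To make the dispatch idea concrete, here is a rough sketch in C. It assumes the callback can get hold of the module's target triple as a string; whether it comes from the triple or the data layout is still an open question. The names `report_kind`, `is_nvptx_triple` and `choose_report_kind` are hypothetical and are not part of Enzyme or GPUCompiler.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical reporting strategies; neither Enzyme nor GPUCompiler
   defines these names. */
typedef enum {
    REPORT_HOST_THROW,       /* raise the error on the host, as today */
    REPORT_DEVICE_EXCEPTION  /* emit a device-side exception report */
} report_kind;

/* NVPTX triples start with "nvptx", e.g. "nvptx64-nvidia-cuda". */
bool is_nvptx_triple(const char *triple) {
    return triple != NULL && strncmp(triple, "nvptx", 5) == 0;
}

/* Pick the reporting path from the target triple alone. */
report_kind choose_report_kind(const char *triple) {
    return is_nvptx_triple(triple) ? REPORT_DEVICE_EXCEPTION
                                   : REPORT_HOST_THROW;
}
```

With this, `choose_report_kind("nvptx64-nvidia-cuda")` selects the device path and any host triple falls back to the current behaviour.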
Shouldn't it be something like https://fwd.gymni.ch/QgKyvm? With a global const in C, it can do store-to-load forwarding.
In discussion with Billy, the right answer is to treat `inttoptr` here as equivalent to an extern global in C.
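A small C sketch of the distinction, for illustration only. The function names are made up, and the integer-to-pointer cast stands in for the `inttoptr` in the IR above. In the first function the compiler sees a constant global with its initializer and may fold the load. In the second it only sees an address, so it has to treat the memory like an `extern` global whose contents are defined elsewhere.

```c
#include <stdint.h>

/* Constant global with a visible initializer: loads from it can be
   replaced by the known bytes. */
static const char known_msg[] = "Enzyme compilation failed.";

char first_char_known(void) {
    return known_msg[0];
}

/* The address arrives as a plain integer, so nothing about the pointee
   is known here. The cast plays the role of LLVM's inttoptr. */
char first_char_opaque(uintptr_t addr) {
    const char *msg = (const char *)addr;
    return msg[0];
}

/* Helper that exposes the address as an integer, the way a JIT might
   bake a host pointer into generated code. */
uintptr_t known_msg_address(void) {
    return (uintptr_t)known_msg;
}
```

Both functions return the same byte when given the same storage; the difference is only in what the optimizer is allowed to assume about it.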