oneAPI.jl

Kernel crash in driver

Open michel2323 opened this issue 2 years ago • 5 comments

Here is the dump as requested in https://github.com/JuliaGPU/KernelAbstractions.jl/pull/312 .

As the index suggests, it is the 8th kernel that is executed. At 108 kB, it is substantially larger than the previous ones, which are around 3 kB each.

I tried it on a machine with an integrated GPU and on the actual GPU.

Attachment: dump_8.img.zip

michel2323, Jul 07 '22 20:07

Sadly the code uses Float64 which my test system does not support. But after compiling IGC with debug info/functionality, I'm getting an IR verification error during back-end compilation:

Instruction does not dominate all uses!
  %value_phi191.off0715753 = phi i8 [ %value_phi45.off0506, %L30922 ], [ 1, %L30911.thread ]
  %i1trunc60 = trunc i8 %value_phi191.off0715753 to i1
PHI nodes not grouped at top of basic block!
  %value_phi43505 = phi i64 [ %value_phi33, %L19878 ], [ 1, %L31303 ]
label %L19900

I'm assuming that this is related, and am reducing the SPIR-V code now.

maleadt, Jul 08 '22 09:07

> Sadly the code uses Float64

We should make the code base type-independent.

michel2323, Jul 08 '22 14:07

That would be useful. In addition, could you post the LLVM IR too, i.e., the output of @device_code or @device_code_llvm dump_module=true?

maleadt, Jul 08 '22 19:07

When doing

@device_code_llvm dump_module=true auglag_linelimit_two_level_alternative_ka(device, 32, data.nline*32)(
    Val(mod.n), data.nline, mod.line_start,
    info.inner, par.max_auglag, par.mu_max, par.scale,
    sol.u_curr, sol.v_curr, sol.z_curr, sol.l_curr, sol.rho,
    par.shift_lines, mod.membuf, data.YffR, data.YffI, data.YftR, data.YftI,
    data.YttR, data.YttI, data.YtfR, data.YtfI,
    data.FrVmBound, data.ToVmBound, data.FrVaBound, data.ToVaBound,
    dependencies=Event(device)
)

it still crashes without producing any output. I'll work on the Float32 port now.

michel2323, Jul 11 '22 16:07

Can you save the input from here, then: https://github.com/JuliaGPU/GPUCompiler.jl/blob/c687cae9510c42cea4c7449731e4a8075f0dc955/src/spirv.jl#L141=

maleadt, Jul 11 '22 17:07