AMDGPU.jl

HSA memory test hangs the GPU in CI

Open luraess opened this issue 3 years ago • 0 comments

Testing the AMDGPU.Mem.unsafe_copy3d! function (#220) may hang the GPU in the Buildkite CI. The hang has not been observed outside of CI. The current workaround is to add an operation before the call to amd_memory_async_copy_rect; sleep and println were both tested, and the code now assigns the signal to sig: https://github.com/JuliaGPU/AMDGPU.jl/blob/cfaade146977594bf18e14b285ee3a9c84fbc7f2/src/memory.jl#L394-L397

A possible cause is some instability related to HSASignal. This may also be related to #208.

luraess • Apr 13 '22 08:04