AMDGPU.jl
unsafe_copy3d! requires 2^4 alignment
HSA reports this in CI on master. We should probably switch to a kernel copy for the portions that aren't aligned.
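A rough sketch of how the dispatch could look, assuming the requirement is 2^4 = 16 bytes and applies to both the base addresses and the row pitches (neither is confirmed here). The names is_aligned and choose_copy_path are hypothetical and not part of AMDGPU.jl, and the check is all-or-nothing rather than splitting out the unaligned portions:

```julia
# Hypothetical helpers for illustration only; not part of AMDGPU.jl.
# Assumption: the required alignment is 2^4 = 16 bytes.
const COPY3D_ALIGNMENT = UInt(2^4)

# True when `x` (an address or a pitch in bytes) is a multiple of `alignment`.
# `alignment` must be a power of two for the bit mask to be valid.
is_aligned(x::Integer, alignment::Integer = COPY3D_ALIGNMENT) =
    (UInt(x) & (UInt(alignment) - 1)) == 0

# Same check for a raw pointer, using its numeric address.
is_aligned(p::Ptr, alignment::Integer = COPY3D_ALIGNMENT) =
    is_aligned(UInt(p), alignment)

# Pick a copy path: the rectangular HSA copy only if everything is aligned,
# otherwise fall back to a kernel-based copy.
function choose_copy_path(src::Ptr, dst::Ptr, src_pitch::Integer, dst_pitch::Integer)
    ok = is_aligned(src) && is_aligned(dst) &&
         is_aligned(src_pitch) && is_aligned(dst_pitch)
    return ok ? :rect_copy : :kernel_copy
end
```

A caller would branch on the returned symbol: use the existing HSA path for :rect_copy and launch a copy kernel for :kernel_copy.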
Is this still the case? (cc @jpsamaroo @pxl-th) Maybe worth checking as I saw the function got refactored in https://github.com/JuliaGPU/AMDGPU.jl/pull/374
hsa_amd_memory_async_copy_rect accepts hsa_pitched_ptr_s, and the docs for it still say that alignment is required.
We may, however, add hsa_amd_memory_async_copy for regular async copies.