
[OpenCL] Modify fill emulation to work for patterns which are not powers of 2

Open keyradical opened this issue 1 year ago • 0 comments

This is a follow-up to https://github.com/oneapi-src/unified-runtime/pull/1412, which added the isPowerOf2 condition to the OpenCL fill function. That condition is correct, since clEnqueueMemFillINTEL_fn only accepts patterns whose size is a power of 2.

What was not correct was the later logic that emulates the fill on the host and copies the result to the destination pointer. It assumed that the pattern is larger than 128 bytes, but after the isPowerOf2 condition was added, the same path can also run for smaller patterns whose size is simply not a power of 2.
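As a rough illustration of the intended behaviour, here is a minimal sketch of a host-side fill that makes no assumption about the pattern size. It is not the code from the PR: `canUseNativeFill`, `makeHostFill`, and `kMaxNativePatternSize` are hypothetical names, the 128-byte limit is taken from the description above, and the handling of a trailing partial pattern is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumed limit for the native fill path, taken from the issue text.
constexpr std::size_t kMaxNativePatternSize = 128;

// True when n is a non-zero power of two.
inline bool isPowerOf2(std::size_t n) { return n != 0 && (n & (n - 1)) == 0; }

// Hypothetical predicate: the native fill is only usable for patterns that
// are a power of two in size and no larger than the assumed limit.
inline bool canUseNativeFill(std::size_t patternSize) {
  return isPowerOf2(patternSize) && patternSize <= kMaxNativePatternSize;
}

// Hypothetical helper: build a host staging buffer of `size` bytes by
// repeating `pattern`. Works for any non-zero patternSize, including small
// sizes that are not powers of two. If `size` is not a multiple of
// patternSize, the last copy is truncated (an assumption of this sketch).
inline std::vector<unsigned char> makeHostFill(const void *pattern,
                                               std::size_t patternSize,
                                               std::size_t size) {
  assert(pattern != nullptr && patternSize > 0);
  std::vector<unsigned char> host(size);
  for (std::size_t offset = 0; offset < size; offset += patternSize) {
    const std::size_t chunk = std::min(patternSize, size - offset);
    std::memcpy(host.data() + offset, pattern, chunk);
  }
  return host;
}
```

A caller would fall back to `makeHostFill` whenever `canUseNativeFill` returns false, then copy the staging buffer to the destination pointer with a regular memcpy-style enqueue.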

This PR fixes the bugs I introduced. intel/llvm CI: https://github.com/intel/llvm/pull/13779

keyradical, May 14 '24 10:05