Konrad Kusiak
This PR changes the `queue.fill()` and `cgh.fill()` implementation to make use of the native functions for a specific backend. It also unifies that implementation with the one for memset, since...
CI for UR PR: https://github.com/oneapi-src/unified-runtime/pull/1368
https://github.com/oneapi-src/unified-runtime/pull/1603
https://github.com/oneapi-src/unified-runtime/pull/1634
### Describe the bug

The reproducer below, which is a copy of [sycl/test-e2e/out_of_order_queue_status.cpp](https://github.com/intel/llvm/blob/sycl/sycl/test-e2e/Basic/out_of_order_queue_status.cpp) except that it uses `Q.memset()` instead of `Q.fill()`, would fail with the following output on the post-commit...
This is a follow-up to https://github.com/oneapi-src/unified-runtime/pull/1412, which added the `isPowerOf2` condition to the OpenCL fill function. This is correct since `clEnqueueMemFillINTEL_fn` only accepts such patterns. What was not correct was...
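As an illustration of the kind of check the `isPowerOf2` condition refers to, here is a minimal sketch. It assumes the condition applies to the fill pattern size in bytes. The helper below is hypothetical and is not the adapter's actual code.

```cpp
#include <cstddef>

// Hypothetical helper named after the condition mentioned above.
// Assumption: the value being checked is the fill pattern size in bytes.
constexpr bool isPowerOf2(std::size_t value) {
  // A power of two has exactly one bit set, so clearing the lowest set bit
  // with value & (value - 1) leaves zero. Zero itself is not a power of two.
  return value != 0 && (value & (value - 1)) == 0;
}

// Compile-time checks of the helper.
static_assert(isPowerOf2(1) && isPowerOf2(4) && isPowerOf2(16),
              "sizes 1, 4 and 16 are powers of two");
static_assert(!isPowerOf2(0) && !isPowerOf2(3) && !isPowerOf2(12),
              "sizes 0, 3 and 12 are not powers of two");
```

With a check like this, a 12-byte pattern (for example three 4-byte values) would not qualify for the native fill path and would need to be handled some other way.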
HIP changes:
- To match the current behaviour of the CUDA adapter, `EvBase` in HIP was moved to `device`, and the `getElapsedTime` function now handles the profiling events' synchronization. Also, we...
This PR changes the `queue.fill()` implementation to make use of the native functions for a specific backend. It also unifies that implementation with the one for memset, since it is...
### Problem Description

It is my understanding that by passing `hipExtAnyOrderLaunch` as the last argument to this entry point: [hipExtModuleLaunchKernel](https://rocm.docs.amd.com/projects/HIP/en/latest/doxygen/html/group___module.html#ga73d0c5f72869e258aa4899a829d9645c), I could achieve asynchronous execution of the kernels that I'm...