Single-pass scan kernel template
This PR provides an implementation of the single-pass scan algorithm as a kernel template.
@adamfidel could you please rebase this branch onto the current main branch?
I think we should better understand the compiler warnings about generic address space operations that we are seeing before merging.
For context, in case any other reviewers have seen something similar before:
warning: Adding 25 occurrences of additional control flow due to presence of generic address space operations in function ...
I've found that all of these warnings stem from the calls to sycl::joint_reduce and sycl::joint_inclusive_scan. I am passing in a raw pointer obtained from a local_accessor, so I thought that changing it to a decorated sycl::multi_ptr would fix the warnings, but it did not. I still need to track this down.
Thanks all for the reviews!
@danhoeflinger and @dmitriy-sobolev, I will create GitHub issues to address the next steps that you mentioned in your approvals.