SYCLomatic
Extension to cg scan
#include <stdio.h>
#include <cooperative_groups.h>
#include <cooperative_groups/scan.h>
namespace cg = cooperative_groups;
__global__ void kernel() {
    auto thread_block = cg::this_thread_block();
    auto tile = cg::tiled_partition<8>(thread_block);
    unsigned int val = cg::inclusive_scan(tile, tile.thread_rank());
    printf("%u: %u\n", tile.thread_rank(), val);
}
Thanks
Hi, we now support migration of inclusive_scan and cooperative_scan for tiled_partition<32>(thread_block). For reference, see https://github.com/oneapi-src/SYCLomatic/pull/1495/files#diff-454c9fb17660120ad6cf4ce877dd77435506b17bea94bdeb8ffdf2d68d799e74
However, migration of the scan functions for tiled_partition<8> is not supported yet. As a workaround, you can migrate the partitioned group manually to a non-uniform group (such as fixed_size_group) using the Intel experimental extension: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_non_uniform_groups.asciidoc
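If you do the manual migration, it may help to have a host-side reference to compare the device output against. The following is only a minimal sketch in plain C++, not part of SYCLomatic or the extension; the function name tiled_inclusive_scan_reference is hypothetical, and it assumes the block size is a multiple of the tile size and that the scan uses the default plus operator, as in the kernel above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side reference (not a SYCLomatic or SYCL API).
// Mirrors what the kernel in the question computes: every thread contributes
// its rank within its tile, and receives the inclusive prefix sum of the
// ranks up to and including its own, restarting at each tile boundary.
std::vector<unsigned int> tiled_inclusive_scan_reference(unsigned int block_size,
                                                         unsigned int tile_size) {
    // Assumption: tiles evenly divide the block, as with tiled_partition<8>.
    assert(tile_size > 0 && block_size % tile_size == 0);

    std::vector<unsigned int> out(block_size);
    unsigned int running = 0;
    for (unsigned int tid = 0; tid < block_size; ++tid) {
        unsigned int rank = tid % tile_size;  // corresponds to tile.thread_rank()
        if (rank == 0) {
            running = 0;                      // a new tile starts a new scan
        }
        running += rank;                      // inclusive: add own value first
        out[tid] = running;
    }
    return out;
}
```

For a tile size of 8, each tile should produce 0, 1, 3, 6, 10, 15, 21, 28 for ranks 0 through 7, so a migrated kernel that prints different values for any rank is likely scanning over the wrong group.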
Thanks.