RAJA generic device policies
Since CUDA/HIP and SYCL order their dimensions differently, it may be convenient to define RAJA device policy aliases to simplify transitions between backends. Each alias would resolve to the corresponding policy of whichever backend is enabled, which would reduce the number of `#ifdef`s in user code.
| RAJA Execution Policies | CUDA/HIP Execution Policies | SYCL Execution Policies |
|---|---|---|
| device_exec<BLOCK_SIZE> | cuda/hip_exec<BLOCK_SIZE> | sycl_exec<WORK_GROUP_SIZE> |
| device_launch_t | cuda/hip_launch_t | sycl_launch_t |
| device_global_size_x_direct<nx_threads> | cuda/hip_global_size_x_direct<nx_threads> | sycl_global_2<WORK_GROUP_SIZE> |
| device_global_size_y_direct<ny_threads> | cuda/hip_global_size_y_direct<ny_threads> | sycl_global_1<WORK_GROUP_SIZE> |
| device_global_size_z_direct<nz_threads> | cuda/hip_global_size_z_direct<nz_threads> | sycl_global_0<WORK_GROUP_SIZE> |
| device_global_thread_x | cuda/hip_global_thread_x | sycl_global_item_2 |
| device_global_thread_y | cuda/hip_global_thread_y | sycl_global_item_1 |
| device_global_thread_z | cuda/hip_global_thread_z | sycl_global_item_0 |
| device_thread_x_direct | cuda/hip_thread_x_direct | sycl_local_2_direct |
| device_thread_y_direct | cuda/hip_thread_y_direct | sycl_local_1_direct |
| device_thread_z_direct | cuda/hip_thread_z_direct | sycl_local_0_direct |
| device_thread_x_loop | cuda/hip_thread_x_loop | sycl_local_2_loop |
| device_thread_y_loop | cuda/hip_thread_y_loop | sycl_local_1_loop |
| device_thread_z_loop | cuda/hip_thread_z_loop | sycl_local_0_loop |
| device_block_x_direct | cuda/hip_block_x_direct | sycl_group_2_direct |
| device_block_y_direct | cuda/hip_block_y_direct | sycl_group_1_direct |
| device_block_z_direct | cuda/hip_block_z_direct | sycl_group_0_direct |
| device_block_x_loop | cuda/hip_block_x_loop | sycl_group_2_loop |
| device_block_y_loop | cuda/hip_block_y_loop | sycl_group_1_loop |
| device_block_z_loop | cuda/hip_block_z_loop | sycl_group_0_loop |
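
As a rough illustration of how the aliases in the table could be selected per backend, here is a minimal sketch. It does not use RAJA itself: the `backend` namespace holds empty stand-in tags that only mirror a few names from the table, and the `DEMO_ENABLE_*` macros are hypothetical placeholders for whatever configuration macros the build provides.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical build switch for this sketch; a real build would set one of these.
#define DEMO_ENABLE_SYCL

// Stand-in policy tags so the sketch compiles without RAJA. They only mirror
// names from the table and carry no behavior.
namespace backend {
template <std::size_t BlockSize> struct cuda_exec {};
template <std::size_t BlockSize> struct hip_exec {};
template <std::size_t WorkGroupSize> struct sycl_exec {};

struct cuda_thread_x_direct {};
struct hip_thread_x_direct {};
struct sycl_local_2_direct {};

struct cuda_block_x_loop {};
struct hip_block_x_loop {};
struct sycl_group_2_loop {};
}  // namespace backend

// Generic device aliases: each name resolves to the enabled backend's policy.
#if defined(DEMO_ENABLE_CUDA)
template <std::size_t N> using device_exec = backend::cuda_exec<N>;
using device_thread_x_direct = backend::cuda_thread_x_direct;
using device_block_x_loop = backend::cuda_block_x_loop;
#elif defined(DEMO_ENABLE_HIP)
template <std::size_t N> using device_exec = backend::hip_exec<N>;
using device_thread_x_direct = backend::hip_thread_x_direct;
using device_block_x_loop = backend::hip_block_x_loop;
#elif defined(DEMO_ENABLE_SYCL)
template <std::size_t N> using device_exec = backend::sycl_exec<N>;
using device_thread_x_direct = backend::sycl_local_2_direct;  // x maps to index 2
using device_block_x_loop = backend::sycl_group_2_loop;       // x maps to index 2
#else
#error "Enable one backend for this sketch"
#endif

// User code names only the generic aliases, so it needs no backend #ifdefs.
using my_exec_policy = device_exec<256>;
using my_inner_policy = device_thread_x_direct;
```

With this shape, the x/y/z to 2/1/0 reversal lives in one header instead of being repeated at every call site.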
Teams and threads are ordered differently between CUDA and SYCL. One option is to consider a policy or alias that enables a more explicit naming convention: CUDA-like, SYCL-like, or layout_left/layout_right.
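
To make the ordering difference concrete, the sketch below encodes the mapping used in the table (x, y, z on the CUDA/HIP side correspond to indices 2, 1, 0 on the SYCL side). The `Dim` enum and both helper functions are hypothetical names invented for illustration; they are not part of RAJA.

```cpp
// Hypothetical helper, not part of RAJA: names a dimension CUDA-style and
// converts it to the SYCL index used in the table.
enum class Dim : int { x = 0, y = 1, z = 2 };

// CUDA/HIP x corresponds to SYCL index 2, y to 1, and z to 0.
constexpr int sycl_index(Dim d) { return 2 - static_cast<int>(d); }

// Inverse mapping: SYCL index 2 corresponds to x, 1 to y, and 0 to z.
constexpr Dim cuda_dim(int sycl_idx) { return static_cast<Dim>(2 - sycl_idx); }

static_assert(sycl_index(Dim::x) == 2, "x corresponds to SYCL index 2");
static_assert(cuda_dim(0) == Dim::z, "SYCL index 0 corresponds to z");
```

An explicit naming convention would let users pick which of these two views they write against, rather than having to remember the reversal.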
@rhornung67 @MrBurmark, this is what I had in mind.
I think the indices for `sycl_global_*` need to be reversed; i.e., `device_global_size_x_direct` should correspond to `sycl_global_2`, not `sycl_global_0`, and so on.
Ah, that's right! Thanks, fixed!