
alignment of CUDA/HIP shared memory?

Open fwyzard opened this issue 9 months ago • 0 comments

GetDynSharedMem<T, BlockSharedMemDynUniformCudaHipBuiltIn>::getMem(acc) is defined as:

https://github.com/alpaka-group/alpaka/blob/9b15e664d103c581020aa5285171b67483eb5c59/include/alpaka/block/shared/dyn/BlockSharedMemDynUniformCudaHipBuiltIn.hpp#L38-L46

  1. If the concern is that the memory may not be aligned enough for T, why not declare it as follows?
extern __shared__ T shMem[];

  2. If the concern is that the memory may not be aligned enough for any type, why not declare it as follows?
extern __shared__ __attribute__((aligned(16))) char shMem[];

  3. Do you think it could be helpful to declare this alignment on the accessor itself, as follows?
__attribute__((assume_aligned(16))) __device__ static auto getMem(BlockSharedMemDynUniformCudaHipBuiltIn const&) -> T*;
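To make options 2 and 3 concrete, here is a minimal host-side C++ sketch of the same idea. It is not the alpaka implementation: the static `shMem` buffer stands in for the `extern __shared__` array, and the free function `getMem<T>()` and the helpers `isAligned` and `example` are hypothetical names used only for illustration. It assumes a GCC- or Clang-compatible compiler for the `assume_aligned` attribute, and it takes the 16-byte figure from the snippets above rather than from any documented guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side stand-in for the dynamic shared memory block. On the device this
// would be the `extern __shared__` array; here it is an ordinary static buffer
// that is explicitly over-aligned to 16 bytes (option 2).
alignas(16) static unsigned char shMem[256];

// Hypothetical accessor with the same shape as getMem() (option 3): the
// attribute promises the compiler that the returned pointer is 16-byte aligned.
// Assumption: GCC/Clang attribute syntax.
template <typename T>
__attribute__((assume_aligned(16))) T* getMem()
{
    // The promise only holds for types that need at most 16-byte alignment.
    static_assert(alignof(T) <= 16, "T requires more than 16-byte alignment");
    return reinterpret_cast<T*>(shMem);
}

// Returns true if the address p is a multiple of `alignment` bytes.
inline bool isAligned(void const* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Example use: view the buffer as float and check the promised alignment.
inline void example()
{
    float* data = getMem<float>();
    assert(isAligned(data, 16));
}
```

The over-aligned byte buffer keeps a single declaration for every `T`, while the attribute on the accessor passes the alignment on to the caller's code. Whether the real CUDA/HIP backends benefit from either would need to be checked against the generated code.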

fwyzard • May 24 '24 08:05