Handle warp configurations for 2D and 3D blocks

Open sjperkins opened this issue 6 years ago • 0 comments

I noticed that the warp_id calculation within much of the code assumes a 1D thread block. Is this an assumed limitation when using trove?

int warp_id = threadIdx.x & WARP_MASK;

To cater for 2D and 3D blocks this should probably be

int warp_id = ((threadIdx.z*blockDim.y + threadIdx.y)*blockDim.x + threadIdx.x) & WARP_MASK;

This PR makes the above change, but I'm not sure whether it's worth going further if the actual algorithm internals depend on a 1D thread block.
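
To illustrate why the two expressions differ, here is a minimal host-side sketch of the index arithmetic. It assumes a 32-thread warp, so WARP_MASK is taken to be 31; that value and the helper names linear_thread_id and warp_id_3d are assumptions made for illustration, not trove's actual definitions. In a kernel the arguments would come from threadIdx and blockDim. CUDA numbers the threads of a block with x varying fastest and forms warps from consecutive thread numbers, which is what the linear index below mirrors.

```cpp
// Assumed values: a 32-thread warp, so the mask keeps the low 5 bits.
constexpr int WARP_SIZE = 32;
constexpr int WARP_MASK = WARP_SIZE - 1;

// Linear index of a thread within its block (x varies fastest).
// In a kernel: tx, ty, tz = threadIdx.{x,y,z}; bdx, bdy = blockDim.{x,y}.
constexpr int linear_thread_id(int tx, int ty, int tz, int bdx, int bdy) {
    return (tz * bdy + ty) * bdx + tx;
}

// Position within the warp, computed from the full linear index
// (the quantity the snippets above call warp_id).
constexpr int warp_id_3d(int tx, int ty, int tz, int bdx, int bdy) {
    return linear_thread_id(tx, ty, tz, bdx, bdy) & WARP_MASK;
}

// 1D block: the general formula reduces to threadIdx.x & WARP_MASK.
static_assert(warp_id_3d(35, 0, 0, 64, 1) == (35 & WARP_MASK),
              "1D blocks are unaffected by the change");

// 16x4 block, thread (x=3, y=1): linear index 1*16 + 3 = 19, so position 19,
// whereas the x-only expression would report 3.
static_assert(warp_id_3d(3, 1, 0, 16, 4) == 19,
              "2D block: x-only formula would be wrong here");
static_assert((3 & WARP_MASK) == 3, "x-only result for comparison");
```

For blocks whose x dimension is a multiple of 32 the two expressions agree, so the difference only shows up for narrower 2D or 3D blocks such as the 16x4 case above.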

sjperkins, Jan 29 '19 10:01