Support the "SIMD"-like intrinsics

Open eyalroz opened this issue 4 years ago • 0 comments

CUDA offers many functions:

https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__SIMD.html

for working with multiple 1-byte and 2-byte values packed into the native 4-byte integers.
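For a concrete picture of what these intrinsics do, here is a minimal sketch built around `__vadd4`, which adds four packed bytes lane by lane with wrap-around. The helper name `add_packed_bytes` is hypothetical and not part of cuda-kat. The host branch is an assumed portable emulation, included only so the semantics can be checked without a GPU.

```cpp
#include <cstdint>

// When not compiling with nvcc, make the CUDA qualifiers vanish so the
// sketch also builds as plain C++.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper (not part of cuda-kat): adds four packed unsigned
// bytes lane by lane, wrapping around within each lane.
__host__ __device__ inline std::uint32_t add_packed_bytes(std::uint32_t a, std::uint32_t b)
{
#ifdef __CUDA_ARCH__
    // Device path: the CUDA SIMD intrinsic for per-byte addition.
    return __vadd4(a, b);
#else
    // Host path: add the low 7 bits of every lane, then restore each
    // lane's top bit with XOR, so no carry crosses a lane boundary.
    std::uint32_t low = (a & 0x7f7f7f7fu) + (b & 0x7f7f7f7fu);
    return low ^ ((a ^ b) & 0x80808080u);
#endif
}
```

For example, `add_packed_bytes(0x01FF7F80u, 0x01010101u)` yields `0x02008081u`: the `0xFF` lane wraps to `0x00` without disturbing its neighbour.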

We should offer explicit access to these, in a form that is better structured than a heap of idiosyncratic names (perhaps via the kat::array type? Some other way?).
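One possible shape for such structured access, as a rough sketch only: a small value type with ordinary operators and named functions that forward to the intrinsics. Everything in the `sketch` namespace (`u8x4`, `make_u8x4`, `lanewise_max`) is a hypothetical name chosen for illustration, not an existing cuda-kat API; a design based on kat::array might look quite different. The host branches are assumed emulations so the code can be tried without a GPU.

```cpp
#include <cstdint>

#ifndef __CUDACC__
#define __host__
#define __device__
#endif

namespace sketch {

// Hypothetical type: four unsigned bytes packed into one 32-bit word.
struct u8x4 {
    std::uint32_t bits;

    // Read lane i, where lane 0 is the least significant byte.
    __host__ __device__ std::uint8_t operator[](int i) const
    {
        return static_cast<std::uint8_t>(bits >> (8 * i));
    }
};

__host__ __device__ inline u8x4 make_u8x4(
    std::uint8_t l0, std::uint8_t l1, std::uint8_t l2, std::uint8_t l3)
{
    return u8x4{ static_cast<std::uint32_t>(l0)
               | (static_cast<std::uint32_t>(l1) << 8)
               | (static_cast<std::uint32_t>(l2) << 16)
               | (static_cast<std::uint32_t>(l3) << 24) };
}

// Lane-wise wrap-around addition; __vadd4 on the device.
__host__ __device__ inline u8x4 operator+(u8x4 lhs, u8x4 rhs)
{
#ifdef __CUDA_ARCH__
    return u8x4{ __vadd4(lhs.bits, rhs.bits) };
#else
    std::uint32_t low = (lhs.bits & 0x7f7f7f7fu) + (rhs.bits & 0x7f7f7f7fu);
    return u8x4{ low ^ ((lhs.bits ^ rhs.bits) & 0x80808080u) };
#endif
}

// Lane-wise unsigned maximum; __vmaxu4 on the device.
__host__ __device__ inline u8x4 lanewise_max(u8x4 lhs, u8x4 rhs)
{
#ifdef __CUDA_ARCH__
    return u8x4{ __vmaxu4(lhs.bits, rhs.bits) };
#else
    return make_u8x4(lhs[0] > rhs[0] ? lhs[0] : rhs[0],
                     lhs[1] > rhs[1] ? lhs[1] : rhs[1],
                     lhs[2] > rhs[2] ? lhs[2] : rhs[2],
                     lhs[3] > rhs[3] ? lhs[3] : rhs[3]);
#endif
}

} // namespace sketch
```

With this, `x + y` and `lanewise_max(x, y)` replace calls to `__vadd4` and `__vmaxu4`, and the lane width is carried by the type instead of by a suffix in the function name.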

We should also check our existing code to see where specializations are in order, so that we benefit from these instructions (e.g. in sequence operations or collaboration primitives).
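To illustrate the kind of specialization meant here, the following sketch shows a byte-sequence elementwise addition that handles four elements per step and finishes the tail one byte at a time. The names `elementwise_add` and `add_bytes_in_word` are hypothetical and do not correspond to existing cuda-kat functions. The loop is deliberately sequential, so distributing the work across threads is left out, and the host branch is an assumed emulation of `__vadd4`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

#ifndef __CUDACC__
#define __host__
#define __device__
#endif

namespace sketch_seq {

// Per-byte wrap-around addition on a packed 32-bit word.
__host__ __device__ inline std::uint32_t add_bytes_in_word(std::uint32_t a, std::uint32_t b)
{
#ifdef __CUDA_ARCH__
    return __vadd4(a, b);
#else
    std::uint32_t low = (a & 0x7f7f7f7fu) + (b & 0x7f7f7f7fu);
    return low ^ ((a ^ b) & 0x80808080u);
#endif
}

// Hypothetical byte specialization of an elementwise sum of two sequences.
__host__ __device__ inline void elementwise_add(
    const std::uint8_t* lhs, const std::uint8_t* rhs,
    std::uint8_t* out, std::size_t n)
{
    std::size_t i = 0;
    // Main part: four bytes per step. memcpy avoids alignment assumptions,
    // and byte order does not matter because every lane is treated alike.
    for (; i + 4 <= n; i += 4) {
        std::uint32_t a, b;
        std::memcpy(&a, lhs + i, 4);
        std::memcpy(&b, rhs + i, 4);
        std::uint32_t sum = add_bytes_in_word(a, b);
        std::memcpy(out + i, &sum, 4);
    }
    // Tail: the remaining zero to three bytes, one at a time.
    for (; i < n; ++i) {
        out[i] = static_cast<std::uint8_t>(lhs[i] + rhs[i]);
    }
}

} // namespace sketch_seq
```

Whether this actually beats the scalar version would need measuring, since the packing overhead and memory access pattern matter as much as the instruction count.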

eyalroz • Jun 23 '20 10:06