cuda-kat

Add wrappers (and builtins?) for more PTX instructions

Open eyalroz opened this issue 4 years ago • 0 comments

The following PTX instructions have neither wrapper functions nor, where relevant, builtins:: templated functions. Add them!

  • [ ] lop3 - Arbitrary logical operation on 3 operands, selected by an 8-bit immediate lookup table.
  • [ ] prefetching instructions?
  • [ ] cvt.pack
  • [ ] fns - find the n'th set bit
  • [ ] Sub-32-bit dot product with accumulation: dp4a and dp2a, for bytes and halfwords respectively.
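To make the request more concrete, here is a rough sketch of what wrappers for two of the items above (lop3 and the signed flavor of dp4a) might look like. Everything named here is an assumption for illustration only: the `sketch` namespace, the `SKETCH_FHD` macro, and the function names `lop3` and `dp4a_s32` are hypothetical and not taken from cuda-kat. Device code uses inline PTX; a plain C++ fallback with the same semantics is included so the behavior can be checked on the host.

```cpp
#include <cstdint>

// Hypothetical decoration macro (an assumption, not a cuda-kat macro).
#ifdef __CUDACC__
#define SKETCH_FHD __forceinline__ __host__ __device__
#else
#define SKETCH_FHD inline
#endif

namespace sketch {

// lop3.b32: for every bit position, the bits of a, b and c form a 3-bit
// index (a is the most significant bit) into the 8-bit table Lut.
// Handy recipe: evaluate the desired expression on 0xF0, 0xCC, 0xAA
// to obtain Lut, e.g. (0xF0 & 0xCC & 0xAA) == 0x80 for a & b & c.
template <unsigned Lut>
SKETCH_FHD std::uint32_t lop3(std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
    static_assert(Lut < 256u, "the lookup table is an 8-bit immediate");
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 500)
    std::uint32_t d;
    asm("lop3.b32 %0, %1, %2, %3, %4;"
        : "=r"(d) : "r"(a), "r"(b), "r"(c), "n"(Lut));
    return d;
#else
    // Portable fallback: OR together the minterms selected by Lut.
    std::uint32_t d = 0;
    for (unsigned idx = 0; idx < 8; ++idx) {
        if ((Lut >> idx) & 1u) {
            std::uint32_t ma = (idx & 4u) ? a : ~a;
            std::uint32_t mb = (idx & 2u) ? b : ~b;
            std::uint32_t mc = (idx & 1u) ? c : ~c;
            d |= ma & mb & mc;
        }
    }
    return d;
#endif
}

// dp4a.s32.s32: treat a and b as four packed signed bytes each,
// multiply them pairwise and add the four products to c.
SKETCH_FHD std::int32_t dp4a_s32(std::uint32_t a, std::uint32_t b, std::int32_t c)
{
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 610)
    std::int32_t d;
    asm("dp4a.s32.s32 %0, %1, %2, %3;"
        : "=r"(d) : "r"(a), "r"(b), "r"(c));
    return d;
#else
    // Portable fallback; accumulate in unsigned arithmetic so that
    // overflow wraps instead of being undefined behavior.
    std::uint32_t acc = static_cast<std::uint32_t>(c);
    for (int i = 0; i < 4; ++i) {
        std::int8_t x = static_cast<std::int8_t>((a >> (8 * i)) & 0xFFu);
        std::int8_t y = static_cast<std::int8_t>((b >> (8 * i)) & 0xFFu);
        acc += static_cast<std::uint32_t>(std::int32_t(x) * std::int32_t(y));
    }
    return static_cast<std::int32_t>(acc);
#endif
}

} // namespace sketch
```

With this shape, a three-way XOR would be `sketch::lop3<0x96>(a, b, c)`, since 0xF0 ^ 0xCC ^ 0xAA is 0x96. Taking the table as a template parameter is one possible design choice: it keeps the value a compile-time immediate, which the instruction requires. The unsigned and mixed-sign dp4a variants, dp2a, fns, cvt.pack and prefetching are left out of the sketch.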

eyalroz, Feb 21 '20 22:02