Srinivasa Ravi
Results: 2 issues of Srinivasa Ravi
This change adds the `NVVM_IntrinsicLoweringOp` class in `NVVMOps.td` to simplify the definition of Ops that are lowered using intrinsics. Some existing Ops have been updated to demonstrate its usage.
mlir:llvm
mlir
This change adds NVVM intrinsics and NVPTX codegen for the `tensormap.replace` PTX instructions. Tests are added in `tensormap_replace.ll`, `tensormap_replace_sm_100a.ll`, and `tensormap_replace_sm_103a.ll`, and were verified with `ptxas-13.0`. PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-tensormap-replace
backend:NVPTX
llvm:ir