cutlass
[FEA] Add cuTensorMapEncodeTiled to CudaHostAdapter
Summary
We are currently working on integrating an FP8 scaled matmul kernel written with CUTLASS into PyTorch. PyTorch has the constraint that it cannot be directly linked against the CUDA driver API. There is one symbol, a direct call to the CUDA driver API function cuTensorMapEncodeTiled, that is causing issues.
We have a temporary workaround here: https://github.com/pytorch/pytorch/pull/125204#discussion_r1618787335
There was a suggestion to add this symbol to CudaHostAdapter so as to add one more layer of indirection. This would greatly aid PyTorch in its use of CUTLASS.