[FEA] Add cuTensorMapEncodeTiled to CudaHostAdapter
Summary
We are currently working on integrating an FP8 scaled matmul kernel written with CUTLASS into PyTorch. PyTorch has the constraint that it cannot be directly linked against the CUDA driver API. There is one direct call to a CUDA driver API symbol, cuTensorMapEncodeTiled, that is causing issues.
We have a temporary workaround here: https://github.com/pytorch/pytorch/pull/125204#discussion_r1618787335
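For context, here is a minimal sketch of one way to avoid the link-time dependency: resolve cuTensorMapEncodeTiled from the already-loaded driver library at runtime. This is only an illustration and not necessarily what the linked PyTorch PR does; the loader name `load_cuTensorMapEncodeTiled` is made up for the example.

```cpp
// Sketch: resolve cuTensorMapEncodeTiled at runtime so the binary has no
// link-time dependency on the CUDA driver (libcuda.so). The function-pointer
// type mirrors the driver API signature declared in cuda.h.
#include <dlfcn.h>
#include <cuda.h>

using cuTensorMapEncodeTiled_t = CUresult (*)(
    CUtensorMap*, CUtensorMapDataType, cuuint32_t, void*,
    const cuuint64_t*, const cuuint64_t*, const cuuint32_t*,
    const cuuint32_t*, CUtensorMapInterleave, CUtensorMapSwizzle,
    CUtensorMapL2promotion, CUtensorMapFloatOOBfill);

cuTensorMapEncodeTiled_t load_cuTensorMapEncodeTiled() {
  // Prefer the driver library that the CUDA runtime has already loaded.
  void* handle = dlopen("libcuda.so.1", RTLD_NOW | RTLD_NOLOAD);
  if (!handle) {
    handle = dlopen("libcuda.so.1", RTLD_NOW);
  }
  if (!handle) {
    return nullptr;
  }
  return reinterpret_cast<cuTensorMapEncodeTiled_t>(
      dlsym(handle, "cuTensorMapEncodeTiled"));
}
```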
There was a suggestion to add this symbol to CudaHostAdapter so as to add one more layer of indirection. This would greatly aid PyTorch in its use of CUTLASS.
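For illustration, a rough sketch of what that extra layer of indirection could look like: the host adapter owns a driver entry point supplied by the caller, so the kernel-construction code never makes a direct, link-time call into libcuda. The struct name `TensorMapHostAdapter` and the method `encode_tiled` are hypothetical; the actual hook added to CudaHostAdapter in CUTLASS 3.5.1 may be named and shaped differently.

```cpp
// Illustrative sketch only: route tensor-map encoding through a virtual hook
// backed by a caller-supplied function pointer (e.g. one resolved with dlopen
// as in the previous sketch) instead of calling the driver symbol directly.
#include <cuda.h>

using TensorMapEncodeFn = CUresult (*)(
    CUtensorMap*, CUtensorMapDataType, cuuint32_t, void*,
    const cuuint64_t*, const cuuint64_t*, const cuuint32_t*,
    const cuuint32_t*, CUtensorMapInterleave, CUtensorMapSwizzle,
    CUtensorMapL2promotion, CUtensorMapFloatOOBfill);

// Hypothetical host-adapter hook: the caller (e.g. PyTorch) decides how, and
// whether, libcuda is reached; the library only sees this interface.
struct TensorMapHostAdapter {
  explicit TensorMapHostAdapter(TensorMapEncodeFn fn) : encode_fn_(fn) {}
  virtual ~TensorMapHostAdapter() = default;

  virtual CUresult encode_tiled(CUtensorMap* map,
                                CUtensorMapDataType dtype,
                                cuuint32_t rank,
                                void* global_address,
                                const cuuint64_t* global_dim,
                                const cuuint64_t* global_strides,
                                const cuuint32_t* box_dim,
                                const cuuint32_t* element_strides,
                                CUtensorMapInterleave interleave,
                                CUtensorMapSwizzle swizzle,
                                CUtensorMapL2promotion l2_promotion,
                                CUtensorMapFloatOOBfill oob_fill) const {
    // Forward to the injected driver entry point; fail cleanly if absent.
    return encode_fn_
        ? encode_fn_(map, dtype, rank, global_address, global_dim,
                     global_strides, box_dim, element_strides, interleave,
                     swizzle, l2_promotion, oob_fill)
        : CUDA_ERROR_NOT_FOUND;
  }

 private:
  TensorMapEncodeFn encode_fn_;
};
```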
@kerrmudgeon
Curious if there is any update here?
CUTLASS 3.5.1 will ship with this before the end of the week.
@drisspg please verify and close? I think this is done, but follow-up work is required in https://github.com/NVIDIA/cutlass/issues/1624.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
Closing as done with #1700