Matthew Nicely

Results 113 comments of Matthew Nicely

@caelunshun sorry for missing this. Are you still having issues?

@vedantroy are you still having issues?

@vedantroy can you check in the latest 1.6.1 release?

@Corendos thanks for reporting the bug. @nvmbreughe can you create an NVBug next week? @Corendos would you be interested in connecting to discuss your use cases?

@yanqinz2 did we resolve this issue? If yes, version number?

Please check out our new [Python interface](https://github.com/NVIDIA/cutlass/tree/main/python). We have an [example](https://github.com/NVIDIA/cutlass/tree/main/examples/python) of creating a PyTorch extension.

Hi @mengsoso, this feature is not on our roadmap, but we welcome community contributions!

As mentioned above, `nvidia-cutlass` has been deprecated. For `nvidia-cutlass-dsl`, the goal is to have a 1:1 mapping between tags and wheels. There may be the rare case we only need...

FA3 is specifically optimized for Hopper tensor cores. It uses instructions that won't run on other HW. You might consider cuDNN or FlashInfer while Tri works on FA4.