grouped_gemm

PyTorch bindings for CUTLASS grouped GEMM.

Results 3 grouped_gemm issues

Hi! I encountered an `ImportError` while running the example and fixed it by changing `from grouped_gemm import permute, unpermute` to `from grouped_gemm.ops import permute, unpermute`. Pardon me if this isn't...
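For anyone whose code has to run against both import layouts, one possible workaround is to try each candidate module in turn. This is only a sketch: `import_names` is a hypothetical helper, not part of `grouped_gemm`, and the only thing assumed about the package is what the issue states (that `permute` and `unpermute` are importable from `grouped_gemm.ops`).

```python
import importlib


def import_names(names, module_candidates):
    """Return the requested attributes from the first candidate module
    that can be imported and exposes all of them.

    Hypothetical helper for illustration; not part of grouped_gemm.
    """
    errors = []
    for module_name in module_candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError as exc:
            # Module (or its parent package) is not importable; try the next one.
            errors.append(f"{module_name}: {exc}")
            continue
        if all(hasattr(module, name) for name in names):
            return tuple(getattr(module, name) for name in names)
        errors.append(f"{module_name}: missing one of {names}")
    raise ImportError("; ".join(errors))


# Usage matching the issue (requires grouped_gemm to be installed):
# permute, unpermute = import_names(
#     ("permute", "unpermute"), ("grouped_gemm", "grouped_gemm.ops")
# )
```

If only one layout needs to be supported, the plain `from grouped_gemm.ops import permute, unpermute` from the issue is simpler.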

Hi! This PR attempts to use the `cublasGemmGroupedBatchedEx` API [introduced in cuBLAS 12.5](https://developer.nvidia.com/blog/introducing-grouped-gemm-apis-in-cublas-and-more-performance-updates/) to compute the grouped GEMM. The code has passed `op_test.py`. There is a potential...
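For readers new to the term, a grouped GEMM is a set of independent matrix multiplications whose shapes may differ from group to group. The pure-Python sketch below illustrates only that semantics. It is not the PR's code, not the cuBLAS API, and not this package's interface; `matmul` and `grouped_gemm_reference` are hypothetical names chosen for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def grouped_gemm_reference(a_groups, b_groups):
    """Compute C_i = A_i @ B_i for each group i.

    Each group may have its own (m, k, n) shape. A GPU implementation
    would run the groups in a single launch; this loop only defines the
    expected result.
    """
    if len(a_groups) != len(b_groups):
        raise ValueError("a_groups and b_groups must have the same length")
    return [matmul(a, b) for a, b in zip(a_groups, b_groups)]


# Two groups with different shapes: (1x2)@(2x1) and (2x2)@(2x2).
a_groups = [[[1, 2]], [[1, 0], [0, 1]]]
b_groups = [[[3], [4]], [[5, 6], [7, 8]]]
print(grouped_gemm_reference(a_groups, b_groups))
# → [[[11]], [[5, 6], [7, 8]]]
```

A slow reference like this can be handy for checking a fast kernel's output on small inputs.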