Jack Kosaian
Yes. For example, you can modify the example [14_ampere_tf32_tensorop_gemm](https://github.com/NVIDIA/cutlass/blob/master/examples/14_ampere_tf32_tensorop_gemm/ampere_tf32_tensorop_gemm.cu) to use double precision. To do so, you can change ElementAccumulator, ElementInputA, and ElementInputB on [these lines](https://github.com/NVIDIA/cutlass/blob/master/examples/14_ampere_tf32_tensorop_gemm/ampere_tf32_tensorop_gemm.cu#L170) to be of...
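A minimal sketch of what those type changes might look like is below. `ElementOutput` is included only as an assumption (the note above mentions only the accumulator and input types), and the tile/instruction shapes elsewhere in the example would also need to be ones that SM80 supports for FP64 tensor-op GEMM.

```cpp
// Hypothetical sketch: the element-type aliases of the example changed to double.
// The tile and instruction shapes used later in the example must also be
// changed to configurations that SM80 supports for FP64 tensor-op GEMM
// (not shown here).
using ElementAccumulator = double;   // accumulator type
using ElementInputA      = double;   // element type of operand A
using ElementInputB      = double;   // element type of operand B
using ElementOutput      = double;   // element type of the output matrix (assumption)
```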
Yes, you're correct. This is being fixed in https://github.com/NVIDIA/cutlass/pull/1451
The CUTLASS Python interface does support s8 GEMMs. Unit tests that show examples of using these are [here](https://github.com/NVIDIA/cutlass/blob/main/test/python/cutlass/gemm/gemm_s8_sm80.py) and [here](https://github.com/NVIDIA/cutlass/blob/main/test/python/cutlass/gemm/gemm_s8_sm90.py). The CUTLASS Python interface does not currently support s4. You...
Thanks! I thought this was being tracked in our CI, but it turns out that the unit tests related to PyTorch extension emission all involved emitting CUTLASS 2.x kernels. I'm...
`sk_regions` indicates the number of sub-partitions of the `sk_tiles` that will be covered by groups of stream-K blocks. You can see that, by default, this value is 1: all stream-K...
A cohort is a way of structuring the assignment of output tiles to CTAs that aims to achieve high L2 cache reuse. It attempts to mirror the concept...
Part of this is just what you mentioned: a missing mapping for sqrt. We would first need to add `sqrt` to `include/cutlass/functional.h`. However, we'd also need to add the mapping...
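For illustration, here is a hedged sketch of what such a functor might look like, following the style of the existing element-wise functors in `include/cutlass/functional.h`. The name `square_root` and the float-based implementation are assumptions; real support would likely also want specializations for CUTLASS numeric types and `Array<T, N>`.

```cpp
// Hypothetical sketch of a sqrt functor in the style of the element-wise
// functors in include/cutlass/functional.h; name and details are assumptions.
#include <math.h>

#include "cutlass/cutlass.h"

namespace cutlass {

template <typename T>
struct square_root {
  CUTLASS_HOST_DEVICE
  T operator()(T const &value) const {
    // Compute in float and convert back to T; specializations for double,
    // half_t, etc. would likely be added for real use.
    return T(::sqrtf(static_cast<float>(value)));
  }
};

} // namespace cutlass
```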
We haven't yet done the plumbing to emit the correct EVT arguments structures for creating a PyTorch extension for a kernel that uses EVT. Apologies that this hasn't been better...
@mhoemmen, can you take a look?
cc @apuaaChen for thoughts on how to do this with CUTLASS 3.x SM80 EVT (likely would need some added ops)