Manish Gupta

Results: 30 comments by Manish Gupta

We do have a [single-stage mma](https://github.com/NVIDIA/cutlass/blob/master/include/cutlass/gemm/threadblock/mma_singlestage.h) pipeline for GEMMs. Do we have any use cases on T4 where the single-stage pipeline wins over the two-stage pipeline?

OptionalAttr: Present OR Not Present. Follow `bypassL1`, which uses the same datatype.
- Present: Allowed to use TF32 lowering given that the data type for the F32 (update the verifier)...

In progress here: https://reviews.llvm.org/D130294

The next steps here are to use the added OptionalAttr `tf32Enabled` and enum `MmaSyncF32Lowering` to enable support for `TF32x3` a.k.a. F32 emulation through TensorCores.
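For readers unfamiliar with the idea behind `TF32x3`, here is a minimal sketch of F32 emulation using three TF32 products. It is not the `nvgpu` lowering itself: the helper names (`to_tf32`, `split_tf32`, `dot_tf32x3`) are hypothetical, TF32 conversion is modeled as truncation of the low 13 mantissa bits (hardware rounding may differ), and the choice and order of the three products follows the commonly described big/small split rather than anything confirmed by the comment above.

```python
import math
import random
import struct


def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_tf32(x: float) -> float:
    """Model TF32 by clearing the low 13 mantissa bits of the f32 encoding.

    Assumption: truncation; real hardware conversion may round differently.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFE000))[0]


def split_tf32(x: float):
    """Split an f32 value into a TF32 'big' part and a TF32 'small' residual."""
    big = to_tf32(x)
    small = to_tf32(to_f32(x - big))
    return big, small


def dot_tf32(a, b) -> float:
    """Plain TF32 dot product: one product per element, f32 accumulation."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f32(acc + to_tf32(x) * to_tf32(y))
    return acc


def dot_tf32x3(a, b) -> float:
    """Three TF32 products per element; the small*small term is dropped."""
    acc = 0.0
    for x, y in zip(a, b):
        x_big, x_small = split_tf32(x)
        y_big, y_small = split_tf32(y)
        acc = to_f32(acc + x_small * y_big)
        acc = to_f32(acc + x_big * y_small)
        acc = to_f32(acc + x_big * y_big)
    return acc


# Deterministic demo data, rounded to f32 so both variants see f32 inputs.
rng = random.Random(0)
a = [to_f32(rng.uniform(0.5, 1.5)) for _ in range(64)]
b = [to_f32(rng.uniform(0.5, 1.5)) for _ in range(64)]

ref = math.fsum(x * y for x, y in zip(a, b))  # f64 reference
err1 = abs(dot_tf32(a, b) - ref)
err3 = abs(dot_tf32x3(a, b) - ref)
print(f"1xTF32 error: {err1:.3e}, 3xTF32 error: {err3:.3e}")
```

In this model the three-product version recovers most of the precision lost by a single TF32 product, at the cost of three multiply-accumulates per element.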

**Summary**
- Scheduling shared memory loads (`ldsm`) and math (`mma.sync`) operations shows performance gains. We are now at 70us for the GEMM we are measuring (3456x1024x2048xf16).
- We reduced it...
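To put the 70us figure in context, the following sketch converts it into achieved throughput. It assumes that 70us is the full kernel time and that each multiply-add counts as 2 FLOPs; the function names are hypothetical. The order of the three dimensions does not matter here, since only their product is used.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """FLOP count of a GEMM, counting each multiply-add as 2 operations."""
    return 2 * m * n * k


def achieved_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved throughput in TFLOP/s for a GEMM that took `seconds`."""
    return gemm_flops(m, n, k) / seconds / 1e12


dims = (3456, 1024, 2048)   # problem size quoted in the comment
runtime_s = 70e-6           # 70us, as quoted in the comment

flops = gemm_flops(*dims)
tflops = achieved_tflops(*dims, runtime_s)
print(f"{flops} FLOP in {runtime_s * 1e6:.0f}us -> {tflops:.1f} TFLOP/s")
```

Under these assumptions the measured GEMM works out to roughly 207 TFLOP/s.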

(1) Support for GEMM pipelining without epilogue peeling (unpeeled epilogue) is done and merged: PR [#10388](https://github.com/iree-org/iree/pull/10388).
- The unpeeled epilogue is shorter and tighter, but...
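The difference between a peeled and an unpeeled epilogue can be illustrated with a toy two-stage software pipeline. This is only a sketch of the general technique, not the IREE implementation: `load` stands in for a global-to-shared memory copy, the `+=` stands in for the `mma.sync` work, and all names are hypothetical.

```python
def load(tiles, i):
    """Stand-in for copying K-tile i into shared memory."""
    return tiles[i]


def compute(tile):
    """Stand-in for the math on one K-tile: a small dot product."""
    a_chunk, b_chunk = tile
    return sum(x * y for x, y in zip(a_chunk, b_chunk))


def pipelined_peeled(tiles):
    """Two-stage pipeline: prologue, steady-state loop, peeled epilogue."""
    n = len(tiles)
    if n == 0:
        return 0
    acc = 0
    staged = load(tiles, 0)              # prologue: prefetch tile 0
    for i in range(n - 1):               # steady state: load i+1, compute i
        nxt = load(tiles, i + 1)
        acc += compute(staged)
        staged = nxt
    acc += compute(staged)               # peeled epilogue: compute only
    return acc


def pipelined_unpeeled(tiles):
    """Same pipeline, but the last iteration stays in the loop (predicated load)."""
    n = len(tiles)
    if n == 0:
        return 0
    acc = 0
    staged = load(tiles, 0)              # prologue: prefetch tile 0
    for i in range(n):
        has_next = i + 1 < n             # predicate guarding the prefetch
        nxt = load(tiles, i + 1) if has_next else None
        acc += compute(staged)
        staged = nxt
    return acc


# Four K-tiles, each holding a chunk of a row of A and a column of B.
tiles = [([1, 2], [3, 4]), ([5, 6], [7, 8]), ([9, 10], [11, 12]), ([13, 14], [15, 16])]
reference = sum(compute(t) for t in tiles)
print(pipelined_peeled(tiles), pipelined_unpeeled(tiles), reference)
```

In this sketch the unpeeled form has a single loop body and no separate tail code, while paying for a predicate check on every prefetch; both forms produce the same result as the unpipelined reference.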

For a separate issue, [#9394](https://github.com/iree-org/iree/issues/9394), and the corresponding [LLVM patch](https://reviews.llvm.org/D130294), we have added an optional attribute `tf32Enabled` to make the IR aware that the MmaSyncOp input data type is TF32 and...

Progress on bullet (2): The changes that handle native sizes for `nvgpu.mma.sync` and `nvgpu.ldmatrix` are ready to start merging into llvm/llvm-project and iree-org/iree.
(i) [llvm/llvm-project](https://github.com/manishucsd/llvm-project/compare/41b3beb7614ae7da833bf8330103ddd0a86e528b...manishucsd:llvm-project:llvm_unrolling_mma_ops#diff-b440b410c31f183ebdcce48e43d7fe089095fddb531232ef23b6508046479b9dR196), and
- The iree-org/iree-llvm-fork integrate process is running behind,...

We have pushed the changes that improve Ampere Tensor Core `mma.sync` performance for F16 and F32. We are now tracking performance issues and further improvements in smaller PRs. I think...