FBGEMM
Gate invalid Triton autotune configs in AOTInductor for GFX95+
Summary: Saw a lowering error when compiling models on MI350X with FP8 PyTorch: P1966277532
The issue arises from the lack of instruction support for BLOCK_K <= 64 when matrix_instr_nonkdim=16 on GFX95+ hardware. This was previously patched for FP8 Triton in D81180838, but the error now shows up in AOTI code paths with FP8 PyTorch.
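For illustration, a minimal Python sketch of this kind of gating: pruning autotune configs that combine matrix_instr_nonkdim=16 with BLOCK_K <= 64 before autotuning on GFX95+ devices. The helper names (`is_gfx95_plus`, `prune_invalid_configs`) and the `gcnArchName`-based architecture check are assumptions for the sketch, not the actual implementation in this diff.

```python
# Hypothetical sketch, not the FBGEMM code from this diff: gate Triton
# autotune configs that GFX95+ matrix cores cannot execute.
import torch
import triton


def is_gfx95_plus() -> bool:
    # Assumption: detect GFX95+ via the ROCm arch name reported by torch.
    # The real check in FBGEMM may differ.
    if not torch.version.hip:
        return False
    arch = torch.cuda.get_device_properties(0).gcnArchName
    return arch.startswith("gfx95")


def prune_invalid_configs(configs: list[triton.Config]) -> list[triton.Config]:
    """Drop configs that are invalid on GFX95+ hardware."""
    if not is_gfx95_plus():
        return configs
    pruned = []
    for cfg in configs:
        # Assumption: BLOCK_K and matrix_instr_nonkdim are meta-parameters
        # in the config's kwargs dict, as is common in AMD Triton kernels.
        block_k = cfg.kwargs.get("BLOCK_K", 0)
        nonkdim = cfg.kwargs.get("matrix_instr_nonkdim", 0)
        # Per the summary: matrix_instr_nonkdim=16 with BLOCK_K <= 64 lacks
        # instruction support on GFX95+, so such configs are gated out.
        if nonkdim == 16 and block_k <= 64:
            continue
        pruned.append(cfg)
    return pruned
```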
Differential Revision: D83383625
Deploy Preview for pytorch-fbgemm-docs ready!
| Name | Link |
|---|---|
| Latest commit | 66d3d30b9b65aacd0cad80d894f087da6c32daa9 |
| Latest deploy log | https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68d7264a1236520008f4811f |
| Deploy Preview | https://deploy-preview-4940--pytorch-fbgemm-docs.netlify.app |
@JChunX has exported this pull request. If you are a Meta employee, you can view the originating diff in D83383625.