FBGEMM

Gate invalid triton autotune configs in AOTInductor for GFX95+

Open JChunX opened this issue 2 months ago • 2 comments

Summary: Lowering models on MI350X with FP8 PyTorch fails with a lowering error: P1966277532

The error arises because GFX95+ hardware lacks instruction support for BLOCK_K <= 64 when matrix_instr_nonkdim=16. This was previously patched for FP8 Triton in D81180838, but the error now shows up in AOTI codepaths with FP8 PyTorch.
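
For illustration only, the gating rule described above can be sketched in plain Python. This is not the code in the diff: `GemmConfig`, `is_gfx95_plus`, `is_supported`, and `gate_configs` are hypothetical names, and the architecture check is a simple prefix match on the assumed string "gfx95". The sketch only shows the shape of the check (drop any config with BLOCK_K <= 64 and matrix_instr_nonkdim=16 when targeting GFX95+).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GemmConfig:
    """Hypothetical stand-in for an autotune config (not a Triton class)."""
    block_m: int
    block_n: int
    block_k: int
    matrix_instr_nonkdim: int


def is_gfx95_plus(arch: str) -> bool:
    # Assumption: the target is identified by a string such as "gfx950".
    # Only the "gfx95" prefix is matched here; later architectures would
    # have to be added explicitly.
    return arch.startswith("gfx95")


def is_supported(cfg: GemmConfig, arch: str) -> bool:
    # Rule from the summary: on GFX95+, BLOCK_K <= 64 combined with
    # matrix_instr_nonkdim == 16 has no instruction support.
    if is_gfx95_plus(arch) and cfg.matrix_instr_nonkdim == 16 and cfg.block_k <= 64:
        return False
    return True


def gate_configs(configs: List[GemmConfig], arch: str) -> List[GemmConfig]:
    # Keep only the configs that pass the check for this architecture.
    return [cfg for cfg in configs if is_supported(cfg, arch)]


candidates = [
    GemmConfig(128, 128, 64, 16),   # gated on GFX95+
    GemmConfig(128, 128, 128, 16),  # kept: BLOCK_K > 64
    GemmConfig(128, 128, 64, 32),   # kept: nonkdim != 16
]
kept_gfx950 = gate_configs(candidates, "gfx950")
kept_gfx942 = gate_configs(candidates, "gfx942")
```

With these candidates, only the first config is dropped for "gfx950", while all three survive for "gfx942".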

Differential Revision: D83383625

JChunX, Sep 26 '25 23:09

Deploy Preview for pytorch-fbgemm-docs ready!

Latest commit: 66d3d30b9b65aacd0cad80d894f087da6c32daa9
Latest deploy log: https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68d7264a1236520008f4811f
Deploy Preview: https://deploy-preview-4940--pytorch-fbgemm-docs.netlify.app

netlify[bot], Sep 26 '25 23:09

@JChunX has exported this pull request. If you are a Meta employee, you can view the originating diff in D83383625.

facebook-github-bot, Sep 26 '25 23:09