Update CUTLASS-based sparse semi-structured GEMM
Stack from ghstack (oldest at bottom):
- -> #117124
cc @nikitaved @pearu @cpuhrsch @amjames @bhosmer @jcaip @ptrblck
This PR updates the CUTLASS-based sparse semi-structured GEMM implementation. It replaces the SparseGemmRowBroadcast GEMM variant with the recently added EVT (Epilogue Visitor Tree) epilogue support for sparse GEMM. The former was pretty much a hack, hopefully to be removed from CUTLASS, while EVT epilogues provide a much more general approach to adding bias, applying activation functions, and fusing other operations with sparse GEMM. This PR is to be merged only after the CUTLASS pin in PyTorch is upgraded either to the commit that adds EVT epilogues for sparse GEMM to CUTLASS, or to a newer one.
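For readers unfamiliar with what the fused epilogue computes, here is a minimal pure-Python reference sketch. It is not the CUTLASS kernel or the code in this PR. The helper names (`compress_2_4`, `sparse_gemm_bias_relu`) are hypothetical, and the sketch assumes 2:4 sparsity on the left operand, a bias with one entry per output row, and ReLU as the activation.

```python
def compress_2_4(row):
    """Compress one 2:4-sparse row into kept values and their column indices.

    Every group of 4 consecutive elements may hold at most 2 nonzeros;
    exactly 2 positions per group are kept (padded with zeros if needed).
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    values, indices = [], []
    for start in range(0, len(row), 4):
        group = row[start:start + 4]
        keep = [i for i, v in enumerate(group) if v != 0]
        if len(keep) > 2:
            raise ValueError("row is not 2:4 sparse")
        # Pad with zero positions so each group stores exactly two entries.
        for i in range(4):
            if len(keep) == 2:
                break
            if i not in keep:
                keep.append(i)
        for i in sorted(keep):
            values.append(group[i])
            indices.append(start + i)
    return values, indices


def sparse_gemm_bias_relu(a, b, bias):
    """Compute relu(a @ b + bias[:, None]) with `a` stored in compressed form.

    The bias add and the activation are applied right after each output
    element is accumulated, which is the role an epilogue plays in a kernel.
    """
    n_cols = len(b[0])
    out = []
    for row, row_bias in zip(a, bias):
        values, indices = compress_2_4(row)
        out_row = []
        for j in range(n_cols):
            acc = sum(v * b[k][j] for v, k in zip(values, indices))
            out_row.append(max(acc + row_bias, 0))  # fused bias + ReLU
        out.append(out_row)
    return out


a = [[1, 0, 2, 0],
     [0, 0, 0, 3]]
b = [[1, 2], [3, 4], [5, 6], [7, 8]]
result = sparse_gemm_bias_relu(a, b, bias=[-20, 1])
```

Swapping the `max(..., 0)` line for another elementwise function is the reference-level analogue of composing a different epilogue, without touching the multiply-accumulate loop.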
@alexsamardzic - We'll want to update to the next version of CUTLASS before we can pull this in. Do you know when the planned release is? Is the required change already part of a release?
The change is included in CUTLASS 3.4.0, released yesterday.
Merged into main along with CUTLASS update to 3.4.1, through PR 120434.