Kai Londenberg
### Bug description

When running the provided code as a standalone executable, a CUDA illegal memory access is reported. Using compute-sanitizer, I could pinpoint this to an illegal shared memory...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* #121492
* #124928
* __->__ #124930
* #124929

Fixes cutlass_utils.get_max_alignment(), which until now did not check the alignment properly. Basically the method so...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #124928
* #125406

This diff makes sure that a custom exception is thrown when no valid choices remain during autotuning. This allows...
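
The pattern described above, raising a dedicated exception once autotuning has no candidates left, can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the names `NoChoicesLeftError` and `pick_fastest`, and the convention that a benchmark callback returns `None` for a failed candidate, are assumptions made only for this example.

```python
from typing import Callable, Dict, Iterable, Optional


class NoChoicesLeftError(RuntimeError):
    """Hypothetical exception: every autotuning candidate was rejected."""


def pick_fastest(
    choices: Iterable[str],
    benchmark: Callable[[str], Optional[float]],
) -> str:
    """Return the name of the fastest candidate.

    `benchmark` is assumed to return a runtime in milliseconds, or None
    when the candidate failed to compile or run.
    """
    timings: Dict[str, float] = {}
    for name in choices:
        runtime = benchmark(name)
        if runtime is not None:
            timings[name] = runtime

    if not timings:
        # A dedicated exception type lets callers tell "nothing to choose
        # from" apart from unrelated runtime errors.
        raise NoChoicesLeftError("no valid choices remain after benchmarking")

    return min(timings, key=lambda name: timings[name])
```

Because the exception has its own type, a caller can catch exactly this case and decide what to do next, without swallowing other errors.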
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* #124928
* __->__ #125406

Enable nonzero workspace and Cutlass StreamK for Inductor Cutlass GEMM ops. This is a simpler rewrite of my original...