Alex Brown

Results 9 issues of Alex Brown

Reduces data copying and comparisons when reading arrays, maps, and enums.

NoCI

Script to remove kernels from a logic file that are not used by any of the tuned sizes
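A minimal sketch of the pruning idea, not the actual script. It assumes a simplified, hypothetical layout in which a logic file reduces to a list of kernels plus a mapping from each tuned size to a kernel index. The names `prune_unused_kernels`, `kernels`, and `size_to_kernel` are illustrative only. The point it shows is that surviving kernel indices must be renumbered once unused kernels are dropped.

```python
def prune_unused_kernels(kernels, size_to_kernel):
    """Drop kernels no tuned size refers to and renumber the rest.

    kernels: list of kernel descriptions (hypothetical layout).
    size_to_kernel: dict mapping a tuned size to an index into `kernels`.
    """
    # Indices that at least one tuned size points at, in original order.
    used = sorted(set(size_to_kernel.values()))
    # Old index -> new, compacted index.
    remap = {old: new for new, old in enumerate(used)}
    kept = [kernels[old] for old in used]
    new_map = {size: remap[old] for size, old in size_to_kernel.items()}
    return kept, new_map


kernels = ["k0", "k1", "k2", "k3"]
size_to_kernel = {(64, 64, 64): 2, (128, 128, 64): 0, (256, 256, 64): 2}
kept, new_map = prune_unused_kernels(kernels, size_to_kernel)
```

Here `k1` and `k3` are removed because no tuned size selects them, and the sizes that pointed at index 2 now point at index 1.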

NoCI

Alternative implementation of the 2-tile algorithm that runs the DP tiles first and the SK tiles afterward. This ordering is expected to give a small performance boost.

This update fixes the alpha=0 case by ensuring that the A/B matrices are not read and the main loop does not run. Also added a new small test case with stream-k...
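The alpha=0 behaviour described above can be illustrated with a plain reference GEMM, computing alpha * (A x B) + beta * C. This is a sketch on Python lists, not the generated kernel code. The function name `gemm` and its argument layout are assumptions made for illustration.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of lists."""
    rows, cols = len(c), len(c[0])
    if alpha == 0:
        # Early exit: A and B are never indexed and the k-loop never runs,
        # so the result depends only on beta and C.
        return [[beta * c[i][j] for j in range(cols)] for i in range(rows)]
    depth = len(b)
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = 0
            for k in range(depth):  # main loop over the shared dimension
                acc += a[i][k] * b[k][j]
            row.append(alpha * acc + beta * c[i][j])
        out.append(row)
    return out


# With alpha == 0 the A and B arguments are untouched, so None is accepted.
scaled = gemm(0, None, None, 2, [[1, 2], [3, 4]])
full = gemm(1, [[1, 2]], [[3], [4]], 0, [[5]])
```

Passing `None` for A and B in the first call is a quick way to check that the alpha=0 path reads neither input.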

GSU=0 should disable all GSU code. This change updates some sections of code that were still generating GSU-related code when GSU was disabled.

Allow wavegroup to be less than 4 in stream-k kernels. This change updates the partials and fixup code to take the number of waves into account. Added new test cases to...