[gfx950][mxfp4] Verify the state of current heuristics

Open Muzammiluddin-Syed-ECE opened this issue 1 month ago • 0 comments

Use amdsharktuner to collect performance data on how knobs such as workgroup thread count, subgroup count, and tile size affect the best achievable performance at the shapes of interest listed below. This will help us verify how reliable our existing heuristics are. The intention is to compare the tuned results against the performance obtained by copying the configs of a handwritten assembly kernel, and to note whether we can do better.

M, N, K/2, K/32
512,1024,8192,512
512,16384,8192,512
512,53248,8192,512
1024,16384,8192,512
1024,1024,8192,512
1024,53248,8192,512
2048,1024,8192,512
2048,16384,8192,512
2048,53248,8192,512
512,16384,26624,1664
1024,16384,26624,1664
2048,16384,26624,1664
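
For bookkeeping, the shape list above can be turned into full problem sizes. The following is a minimal sketch, not part of amdsharktuner: `load_shapes` and the field names are hypothetical, and it assumes that the `K/2` column is the packed fp4 extent (two values per byte) and the `K/32` column is the number of scale blocks along K. It recovers K, checks that the two columns agree, and computes the nominal `2*M*N*K` FLOP count that a measured latency could be normalized against.

```python
import csv
import io

# Shapes copied verbatim from the issue (columns: M, N, K/2, K/32).
SHAPES_CSV = """\
M,N,K/2,K/32
512,1024,8192,512
512,16384,8192,512
512,53248,8192,512
1024,16384,8192,512
1024,1024,8192,512
1024,53248,8192,512
2048,1024,8192,512
2048,16384,8192,512
2048,53248,8192,512
512,16384,26624,1664
1024,16384,26624,1664
2048,16384,26624,1664
"""


def load_shapes(text):
    """Parse the shape list and derive K and the nominal FLOP count.

    Hypothetical helper: assumes K/2 is the packed fp4 extent and
    K/32 is the number of scale blocks along K.
    """
    shapes = []
    for rec in csv.DictReader(io.StringIO(text)):
        m, n = int(rec["M"]), int(rec["N"])
        k_half, k_blocks = int(rec["K/2"]), int(rec["K/32"])
        k = 2 * k_half
        # Both columns must describe the same K.
        if k != 32 * k_blocks:
            raise ValueError(f"inconsistent K for shape M={m}, N={n}")
        shapes.append({"M": m, "N": n, "K": k, "flops": 2 * m * n * k})
    return shapes


shapes = load_shapes(SHAPES_CSV)
for s in shapes:
    print(s["M"], s["N"], s["K"], s["flops"])
```

Dividing `flops` by a measured kernel time gives a throughput figure that is comparable across shapes, which may make it easier to see where the tuned configs and the assembly-kernel configs diverge.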

Muzammiluddin-Syed-ECE, Nov 26 '25 01:11