libxsmm

Limit K unrolling in amx gemm kernel

Open egeor opened this issue 2 years ago • 1 comments

The AMX GEMM kernels currently unroll the K loop fully. This causes code buffer size issues, so for K >= 4096 we fall back to AVX-512 code generation. We could instead limit the unrolling of the K loop for large K values and still generate AMX code.
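
To illustrate the idea, here is a minimal sketch of how a capped unroll could be planned. It is not the libxsmm code generator: `k_unroll_plan`, `plan_k_unroll`, and the `max_unroll` cap are hypothetical names, and the cap is assumed to come from whatever code buffer budget the generator has.

```c
#include <stddef.h>

/* Hypothetical plan for a partially unrolled K loop; not a libxsmm type. */
typedef struct {
  size_t unroll;    /* K steps emitted back to back inside the loop body */
  size_t trips;     /* number of iterations of the emitted loop */
  size_t remainder; /* K steps emitted once after the loop */
} k_unroll_plan;

/* k_blocks:   number of K steps the kernel has to cover.
 * max_unroll: assumed cap on unrolled steps (treated as 1 if 0 is passed). */
static k_unroll_plan plan_k_unroll(size_t k_blocks, size_t max_unroll) {
  k_unroll_plan p;
  if (max_unroll == 0) {
    max_unroll = 1;
  }
  if (k_blocks <= max_unroll) {
    /* Small K: keep full unrolling, a single pass and no remainder. */
    p.unroll = k_blocks;
    p.trips = (k_blocks > 0) ? 1 : 0;
    p.remainder = 0;
  } else {
    /* Large K: emit max_unroll steps per trip, then a short tail. */
    p.unroll = max_unroll;
    p.trips = k_blocks / max_unroll;
    p.remainder = k_blocks % max_unroll;
  }
  return p;
}
```

With this split, the emitted code size grows with `max_unroll` plus the remainder instead of with K, and `unroll * trips + remainder` always equals `k_blocks`.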

egeor avatar Nov 14 '23 20:11 egeor

Addendum: We also seem to have undefined behavior when running out of code buffer space. Sometimes we exit gracefully, with code generation returning a NULL pointer; sometimes we just crash.
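
One way to make the out-of-space case deterministic is to bounds-check every write and keep a sticky error flag, so the generator always ends up returning NULL. The sketch below only illustrates that pattern; `code_buffer`, `code_buffer_append`, and `code_buffer_finish` are hypothetical names and do not reflect how libxsmm actually manages its code buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded code buffer; not a libxsmm type. */
typedef struct {
  unsigned char *data; /* caller-provided storage */
  size_t capacity;     /* total bytes available in data */
  size_t size;         /* bytes written so far (always <= capacity) */
  int overflowed;      /* sticky flag, set on the first failed append */
} code_buffer;

/* Appends n bytes, or records the overflow and writes nothing.
 * Returns 0 on success and -1 once the buffer has run out of space. */
static int code_buffer_append(code_buffer *buf, const void *bytes, size_t n) {
  if (buf->overflowed || n > buf->capacity - buf->size) {
    buf->overflowed = 1;
    return -1;
  }
  memcpy(buf->data + buf->size, bytes, n);
  buf->size += n;
  return 0;
}

/* Returns the generated code, or NULL if any append ran out of space. */
static const unsigned char *code_buffer_finish(const code_buffer *buf) {
  return buf->overflowed ? NULL : buf->data;
}
```

Because the flag is sticky, callers do not have to check every single append; checking the result of `code_buffer_finish` once is enough to detect the failure.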

alheinecke avatar Nov 14 '23 23:11 alheinecke

Closed by PR #868

egeor avatar Feb 24 '24 08:02 egeor