AMX-TMUL-Code-Samples
AMX example clarification
Hi, why do we configure four tiles in total (i = 0, 1, 2, 3) here when the computation only needs three? https://github.com/intel/AMX-TMUL-Code-Samples/blob/e4029b9184bac98ecb0c4472bc883bccc190933b/src/test-amxtile.c#L46
Yes, I would also like to know. Or is there something special about tmm0?