Optimize narrow-M `mmt4d` ukernel tile functions

Open bjacob opened this issue 1 year ago • 0 comments

We have `mmt4d` ukernel tile functions for a number of narrow-M cases, but they were added as naive truncations of the general case. Often that's fine, but sometimes it results in convoluted and inefficient ukernels. I'm thinking particularly of the int8 ukernels on x86-64.
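For readers unfamiliar with the term, here is a minimal scalar sketch of what a tile function computes and what "narrow-M" means. It is not IREE's actual code: the function name `mmt4d_tile_s8s8s32_ref` is hypothetical, and the panel layouts (LHS as `K x M0 x K0`, RHS as `K x N0 x K0`, accumulator tile as `M0 x N0`) are an assumption about the `mmt4d` data layout. A narrow-M case is simply a call with a small `M0`, such as 1, 2 or 4.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical reference tile function (illustration only, not IREE's code).
// Accumulates an M0 x N0 tile of int32 from int8 inputs:
//   out[m][n] += sum over k, k0 of lhs[k][m][k0] * rhs[k][n][k0]
// Assumed layouts: lhs is K x M0 x K0, rhs is K x N0 x K0, out is M0 x N0,
// all row-major and contiguous.
static void mmt4d_tile_s8s8s32_ref(int32_t *out, const int8_t *lhs,
                                   const int8_t *rhs, int K, int M0, int N0,
                                   int K0) {
  assert(K >= 0 && M0 > 0 && N0 > 0 && K0 > 0);
  for (int k = 0; k < K; ++k) {
    for (int m = 0; m < M0; ++m) {
      for (int n = 0; n < N0; ++n) {
        int32_t acc = 0;
        for (int k0 = 0; k0 < K0; ++k0) {
          acc += (int32_t)lhs[(k * M0 + m) * K0 + k0] *
                 (int32_t)rhs[(k * N0 + n) * K0 + k0];
        }
        out[m * N0 + n] += acc;
      }
    }
  }
}
```

With `M0 = 1` the loop over `m` collapses and the tile is effectively a vector-matrix product. A SIMD kernel whose register layout was designed around a larger `M0` does not necessarily remain a good design when rows are simply dropped, which is the kind of truncation this issue is about.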

bjacob · Feb 14 '24 02:02