
AVX512F optimized matrix multiplication

Open · kali opened this issue 2 years ago · 1 comment

Hey @tgolsson!

The AVX512F question is back on the radar with another team evaluating performance on this architecture. IIRC, you have already done the heavy lifting on this front, with decent results. Would it be possible to put these kernels back in play? I can help with finalizing integration myself, but would hate to duplicate what you've already done...

Thanks a lot.

kali · Aug 12 '22 10:08

Hey! Yeah, I'd done a bunch of work, some kernels... I'll try to clean it up and push. It's not in a shippable state, but I'll happily contribute what I've done. :) I'll do it either later today or, more likely, over the weekend or on Monday.

tgolsson · Aug 12 '22 13:08