tblis
Is this library's matrix multiplication as fast as MKL's?
The matrix multiplication primitives are essentially the same as in BLIS; you can find lots of performance graphs for BLIS here. BLIS is typically as fast as or faster than OpenBLAS and about 10% slower than MKL. Of course, TBLIS's forte is tensor operations. These are not natively available in MKL, and implementing them there as tensor transpose + matrix multiplication is much slower than running them in TBLIS (see here).
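To make the "tensor transpose + matrix multiplication" approach concrete, here is a rough sketch in NumPy (not the TBLIS or MKL API). The contraction, the shapes, and the helper name `contract_ttgt` are all made up for illustration. It only shows the extra transposed copies that a matrix-multiply-only library forces on you; it says nothing about actual timings.

```python
import numpy as np

def contract_ttgt(A, B):
    """Compute C[a,i,b] = sum_k A[a,k,b] * B[i,k] via transpose + matrix multiply.

    Hypothetical helper for illustration only; not part of TBLIS, BLIS, or MKL.
    """
    a, k, b = A.shape
    i, k2 = B.shape
    assert k == k2, "contracted dimension must match"

    # Step 1: permute A so the contracted index k is last, then flatten (a, b)
    # into one row index. This forces a full copy of A.
    A_mat = np.ascontiguousarray(A.transpose(0, 2, 1)).reshape(a * b, k)

    # Step 2: put B into (k, i) layout for the matrix product (another copy).
    B_mat = np.ascontiguousarray(B.T)

    # Step 3: a single ordinary matrix multiplication (GEMM).
    C_mat = A_mat @ B_mat  # shape (a*b, i)

    # Step 4: unflatten and permute back to the requested (a, i, b) layout,
    # which costs one more copy.
    return np.ascontiguousarray(C_mat.reshape(a, b, i).transpose(0, 2, 1))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((3, 5))

C_ttgt = contract_ttgt(A, B)
# Reference result computed directly from the index expression.
C_ref = np.einsum("akb,ik->aib", A, B)
```

Steps 1, 2, and 4 are pure data movement. A native tensor contraction works on the original layouts and avoids those extra copies.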