Performance of MKL's SGEMM
Powerful design and a beautiful introduction! What is the single-thread AVX2 SGEMM performance of Intel MKL (with MKL_CBWR="AVX2") on the Intel Xeon Gold 6142? I'm just curious about that.
I've tested single-thread AVX2 SGEMM performance on an i7-9800X (fixed at 3.0 GHz, 4-channel DDR4-2400, mesh at 2.4 GHz), which shares the same microarchitecture as the Xeon Gold 6142:
Routine ........................... Peak performance
Your_SGEMM (default parameters) ... 89-90 GFLOPS
Intel MKL (2019 Update 4) ......... 89-90 GFLOPS (estimated)
OpenBLAS (0.3.8-dev) .............. 91-92 GFLOPS
Theoretical peak .................. 96 GFLOPS
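For reference, the 96 GFLOPS theoretical figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it assumes two 256-bit FMA units per core and counts each fused multiply-add as two floating-point operations. The function name `peak_gflops` and the `reported` dictionary are made up here for illustration; the ranges are copied from the numbers above.

```python
# Theoretical single-core FP32 peak for AVX2 code.
# Assumption: 2 FMA units per core, each operating on 256-bit vectors.
def peak_gflops(freq_ghz, fma_units=2, vector_bits=256, element_bits=32):
    lanes = vector_bits // element_bits       # 8 floats per 256-bit register
    flops_per_cycle = fma_units * lanes * 2   # one FMA = multiply + add
    return freq_ghz * flops_per_cycle

peak = peak_gflops(3.0)  # core clock fixed at 3.0 GHz in the test above

# Reported peak ranges (GFLOPS) from the measurements above.
reported = {
    "Your_SGEMM": (89, 90),
    "Intel MKL": (89, 90),
    "OpenBLAS": (91, 92),
}

for name, (low, high) in reported.items():
    print(f"{name}: {low / peak:.1%} to {high / peak:.1%} of theoretical peak")
```

Under these assumptions all three routines land in roughly the 93% to 96% range of the theoretical peak.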
@wjc404 Thanks for your interest. By the way, what M, N, and K dimensions did you use in the test above?
I called gemm_driver without parameters, so it should have used the default parameters optimized for the Xeon Gold 6142. The figures listed are peak performance, which usually occurs at dimensions above 4000. The MKL speed was measured with a benchmarking program in my repository.
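For anyone wanting to reproduce such numbers, here is a rough sketch of how a GFLOPS figure is usually derived from a timed GEMM call. It is not the benchmarking program mentioned above; `gemm_gflops` and `time_best` are hypothetical helper names, and the conventional 2*M*N*K operation count is assumed.

```python
import time

def gemm_gflops(m, n, k, seconds):
    """Rate for an M x N x K GEMM, counting 2*M*N*K floating-point operations."""
    return 2.0 * m * n * k / seconds / 1e9

def time_best(fn, repeats=5):
    """Return the shortest wall-clock time (seconds) over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Example: a 4000 x 4000 x 4000 SGEMM that takes 1.0 s runs at 128 GFLOPS.
print(gemm_gflops(4000, 4000, 4000, 1.0))
```

Taking the best of several runs is one common way to report a peak; pass the actual SGEMM call as `fn` and feed the resulting time to `gemm_gflops`.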