Performance of mkl's sgemm

Open wjc404 opened this issue 6 years ago • 3 comments

Powerful design and a beautiful write-up. How does the single-threaded AVX2 SGEMM performance of Intel MKL (MKL_CBWR="AVX2") compare on the Intel Xeon Gold 6142? I'm just curious about that.

wjc404 avatar Aug 18 '19 16:08 wjc404

I've tested the single-threaded AVX2 SGEMM performance on an i7-9800X (core clock fixed at 3.0 GHz, 4-channel DDR4-2400, mesh at 2.4 GHz), which shares the same microarchitecture as the Xeon Gold 6142.

| Routine | Peak performance |
|---|---|
| Your_SGEMM (default_parms) | 89-90 GFLOPS |
| Intel_MKL (2019 update 4) | 89-90 GFLOPS (estimated) |
| OpenBLAS (0.3.8-dev) | 91-92 GFLOPS |
| Theoretical | 96 GFLOPS |
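
For readers wondering where the 96 GFLOPS theoretical figure comes from, here is a minimal sketch of the usual peak calculation. It assumes two 256-bit FMA units per core and counts each fused multiply-add as 2 floating-point operations; the function name `peak_gflops` is only illustrative and is not part of the repository.

```python
def peak_gflops(freq_ghz, fma_units, vector_bits, element_bits=32):
    """Single-core theoretical peak in GFLOPS.

    Assumption: every cycle each FMA unit retires one vector FMA,
    and one FMA counts as 2 floating-point operations per lane.
    """
    lanes = vector_bits // element_bits  # 8 floats in a 256-bit AVX2 register
    return freq_ghz * fma_units * lanes * 2


# Core clock fixed at 3.0 GHz, AVX2 (256-bit), assumed 2 FMA units per core.
peak = peak_gflops(3.0, 2, 256)
print(f"theoretical peak: {peak} GFLOPS")

# Fraction of peak reached by the measured ranges quoted in this comment.
measured = {
    "Your_SGEMM": (89, 90),
    "Intel_MKL": (89, 90),
    "OpenBLAS": (91, 92),
}
for name, (low, high) in measured.items():
    print(f"{name}: {low / peak:.1%} to {high / peak:.1%} of peak")
```

Under these assumptions all three libraries land above 90% of the single-core peak.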

wjc404 avatar Jan 15 '20 17:01 wjc404

@wjc404 thanks for your interest. BTW, what were the M, N, K parameters of the test above?

carlushuang avatar Jan 20 '20 14:01 carlushuang

I called gemm_driver without parameters, so it should have used the default parameters optimized on Xeon Gold 6142. The figures listed are peak performance, which usually occurs at dimensions above 4000. The speed of MKL was measured with a benchmarking program in my repository.
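
As a rough illustration of how a GFLOPS figure like the ones above is typically derived from M, N, K and a wall-clock time, here is a minimal sketch. It uses the conventional count of 2*M*N*K floating-point operations for a GEMM; the helper names and the 1.5 s timing are made up for illustration and are not taken from gemm_driver or the benchmarking program mentioned here.

```python
import time


def gemm_flops(m, n, k):
    """Conventional operation count for C = A*B: one multiply and one add per term."""
    return 2 * m * n * k


def gflops(m, n, k, seconds):
    """Convert a measured run time into GFLOPS."""
    return gemm_flops(m, n, k) / seconds / 1e9


def best_time(fn, repeats=5):
    """Run fn several times and keep the fastest run, a common way to report peak speed."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical example: a 4096 x 4096 x 4096 SGEMM that took 1.5 seconds.
print(f"{gflops(4096, 4096, 4096, 1.5):.1f} GFLOPS")
```

In practice `best_time` would wrap the actual SGEMM call; reporting the fastest of several runs is one reason peak numbers sit slightly above typical ones.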

wjc404 avatar Jan 22 '20 11:01 wjc404