
SGEMM performance opportunity on POWER8 VSX

Open edelsohn opened this issue 9 years ago • 1 comments

SGEMM performance on POWER8 VSX has some opportunities for improvement.

For cblas_sgemm_googlenet,

M = 192, N = 3136, K = 576 shows the slowest performance, and M = 320, N = 196, K = 1440 seems to have the greatest opportunity for improvement.

cblas_sgemm_googlenet.cpp.txt

For cblas_sgemm_vggnet,

M = 256, N = 3136, K = 2304 shows the slowest performance, and M = 512, N = 196, K = 4680 seems to have the greatest opportunity for improvement.

cblas_sgemm_vggnet.cpp.txt

These observations are relative to alternate optimized BLAS implementations.
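For anyone comparing numbers across implementations, here is a minimal sketch of how the shapes above could be turned into GFLOP/s figures. It is not taken from the attached benchmark files: `GemmShape`, `gemm_flops`, `gflops`, and `time_seconds` are hypothetical helper names, and the only assumption about the math is the usual count of 2 * M * N * K floating-point operations per SGEMM call.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper type for illustration; not from the attached benchmarks.
struct GemmShape {
    std::int64_t m, n, k;
};

// Shapes quoted in this issue.
constexpr GemmShape kGooglenetSlowest{192, 3136, 576};
constexpr GemmShape kGooglenetOpportunity{320, 196, 1440};
constexpr GemmShape kVggnetSlowest{256, 3136, 2304};
constexpr GemmShape kVggnetOpportunity{512, 196, 4680};

// One multiply and one add per (m, n, k) triple, hence 2 * M * N * K.
// 64-bit arithmetic is needed: the VGG shapes overflow a 32-bit int.
constexpr std::int64_t gemm_flops(const GemmShape& s) {
    return 2 * s.m * s.n * s.k;
}

// Convert a measured time per call (in seconds) into GFLOP/s.
inline double gflops(const GemmShape& s, double seconds) {
    return static_cast<double>(gemm_flops(s)) / seconds / 1e9;
}

// Average wall-clock seconds per invocation of `call` over `repeats` runs.
// `call` would wrap the SGEMM invocation under test.
template <typename F>
double time_seconds(F&& call, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        call();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / repeats;
}
```

Wrapping the `cblas_sgemm` call in a lambda, passing it to `time_seconds`, and feeding the result to `gflops` gives a per-shape figure that can be compared directly between BLAS libraries.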

edelsohn, Aug 23 '16 19:08