
pytorch cpu very slow

Open 2eQTu opened this issue 8 months ago • 0 comments

While evaluating #224 (rdna4/gfx12 support), it seems Python PyTorch is very slow, at least as measured by the bundled pytorch_cpu_vs_gpu_simple_benchmark.sh benchmark.

The test matrix is only 200x200, so to keep setup overhead from skewing the numbers I also tested with a larger 2000x2000 matrix.

| Test | rocm_sdk_builder 633 WIP | pytorch nightly whl |
| --- | --- | --- |
| CPU - 200x200 | 14.94 sec | 0.076 sec |
| GPU - 200x200 | FAIL | 0.680 sec |
| CPU - 2000x2000 | 160.12 sec | 12.11 sec |
| GPU - 2000x2000 | FAIL | 0.238 sec |
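
For anyone trying to reproduce this, here is a rough sketch of how the CPU numbers could be cross-checked while keeping one-time setup cost separate from steady-state time. This is not the contents of pytorch_cpu_vs_gpu_simple_benchmark.sh; `time_op` and `torch_cpu_matmul` are hypothetical helper names, and the sketch assumes a plain square matrix multiply is a fair stand-in for what the bundled benchmark measures.

```python
import statistics
import time


def time_op(op, warmup=1, repeats=3):
    """Time a zero-argument callable.

    Returns (first_call_seconds, median_steady_seconds). The first call is
    reported separately because it may include one-time setup cost.
    """
    start = time.perf_counter()
    op()
    first = time.perf_counter() - start

    # Extra untimed calls so caches and thread pools are warm.
    for _ in range(warmup):
        op()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return first, statistics.median(samples)


def torch_cpu_matmul(n):
    """Hypothetical helper: build an n x n CPU matmul callable with PyTorch."""
    import torch  # imported lazily so time_op works without PyTorch

    a = torch.rand(n, n)
    b = torch.rand(n, n)
    return lambda: torch.mm(a, b)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the matmul timing")
    else:
        # A thread count of 1 on a multi-core machine would be worth noting.
        print("intra-op threads:", torch.get_num_threads())
        for n in (200, 2000):  # the two sizes from the table above
            first, steady = time_op(torch_cpu_matmul(n))
            print(f"CPU {n}x{n}: first {first:.4f} s, steady {steady:.4f} s")
```

If the steady-state time is still far apart between the two builds, setup overhead is probably not the explanation. For a GPU variant the callable would also need to call `torch.cuda.synchronize()` before the clock stops, because kernels are launched asynchronously.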

2eQTu, Apr 05 '25 00:04