
Low performance of xSY/HEGST

Open rasolca opened this issue 8 months ago • 3 comments

xSYGST/xHEGST is significantly slower in rocSOLVER than in cuSOLVER.

(Tested with rocSOLVER 3.26.2)

Performance (GF/s):

| Size | GH200 | MI250x | MI300a |
|---|---|---|---|
| 1024 (typical size we use) | 1270 | 29 | 24 |
| 10240 | 16000 | 1750 | 1613 |
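
To put the gap in proportion, the ratios can be computed directly from the figures above. This is only arithmetic on the reported numbers; the helper name `slowdown_vs_gh200` is made up for illustration and is not part of any library.

```python
# Throughput figures (GF/s) copied from the table in this issue.
gflops = {
    1024: {"GH200": 1270, "MI250x": 29, "MI300a": 24},
    10240: {"GH200": 16000, "MI250x": 1750, "MI300a": 1613},
}


def slowdown_vs_gh200(table):
    """Return {size: {device: GH200 throughput / device throughput}}."""
    return {
        size: {
            dev: row["GH200"] / val
            for dev, val in row.items()
            if dev != "GH200"
        }
        for size, row in table.items()
    }


ratios = slowdown_vs_gh200(gflops)
for size, row in ratios.items():
    for dev, r in row.items():
        print(f"n={size:>5}  {dev}: {r:.1f}x slower than GH200")
```

By these numbers the gap is roughly 44x to 53x at size 1024 and roughly 9x to 10x at size 10240, so the small-matrix case is hit hardest.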

@saadrahim

rasolca avatar Apr 28 '25 15:04 rasolca

Hi @rasolca. Internal ticket has been created for investigation. Thanks!

ppanchad-amd avatar Apr 29 '25 13:04 ppanchad-amd

Hi @rasolca, can you provide any reproducer or sample workload that you are using to compare the performance?

zichguan-amd avatar Apr 30 '25 14:04 zichguan-amd

We use itype = rocblas_eform_ax and both uplo = rocblas_fill_lower and uplo = rocblas_fill_upper.

I used the miniapp_gen_to_std miniapp from https://github.com/eth-cscs/DLA-Future with parameters (--matrix-size <size> --block-size <size>) that fall back to a single LAPACK/cuSOLVER/rocSOLVER call.

In any case, as long as the matrices are valid inputs, the values of the matrix elements have no impact on performance.
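
For anyone building their own inputs, here is a minimal NumPy sketch of what a valid problem looks like and what the `itype = rocblas_eform_ax`, `uplo = rocblas_fill_lower` case computes, assuming standard LAPACK xSYGST semantics (C = inv(L) A inv(L)^T with B = L L^T). It is a CPU reference for checking results, not rocSOLVER code and not a benchmark; the function names `make_inputs` and `gen_to_std_lower` are hypothetical. Note that LAPACK-style xSYGST expects B to already hold the Cholesky factor from xPOTRF, whereas this sketch factors B itself.

```python
import numpy as np


def make_inputs(n, seed=0):
    """Build a symmetric A and a symmetric positive definite B (real case)."""
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((n, n))
    a = (m + m.T) / 2                  # symmetric A
    g = rng.standard_normal((n, n))
    b = g @ g.T + n * np.eye(n)        # SPD B, shifted to stay well conditioned
    return a, b


def gen_to_std_lower(a, b):
    """Reference reduction for itype 1, lower: inv(L) A inv(L)^T, B = L L^T."""
    l = np.linalg.cholesky(b)          # lower-triangular Cholesky factor of B
    x = np.linalg.solve(l, a)          # x = inv(L) A
    return np.linalg.solve(l, x.T).T   # x inv(L)^T = (inv(L) x^T)^T


a, b = make_inputs(8)
c = gen_to_std_lower(a, b)

# The standard problem C y = lambda y has the same eigenvalues
# as the generalized problem A x = lambda B x.
std_eigs = np.linalg.eigvalsh(c)
gen_eigs = np.sort(np.real(np.linalg.eigvals(np.linalg.solve(b, a))))
print(np.allclose(std_eigs, gen_eigs, atol=1e-8))
```

The upper case is analogous with B = U^T U and C = inv(U^T) A inv(U); for the complex (HEGST) case the transposes become conjugate transposes.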

rasolca avatar May 06 '25 14:05 rasolca

This issue has been migrated to: https://github.com/ROCm/rocm-libraries/issues/1676

Imported to ROCm/rocm-libraries

ammallya avatar Sep 18 '25 18:09 ammallya