oneMKL
[DFT][MKLGPU] Tests can fail on PVC
Summary
The DFT tests for the MKLGPU backend can fail on PVC. All of the failures observed so far are 2D cases.
Version
Tip of the develop branch at the time of reporting (https://github.com/oneapi-src/oneMKL/commit/6923d402d5bccba9ae1966062bc5a277fc74776c).
Environment
Running on PVC (Intel(R) Data Center GPU Max 1100 1.3) with the oneAPI Base Toolkit 2024.2.0. OS is Ubuntu 22.04. apt level-zero package versions:
- level-zero: 1.16.15-881~22.04
- level-zero-dev: 1.16.15-881~22.04
- intel-level-zero-gpu: 1.3.30049.10-950~22.04
Steps to reproduce
cmake -Bbuild-pvc -GNinja .
cd build-pvc
ninja
ctest --output-on-failure
Observed behavior
Full log: log_pvc.txt
The failing tests all seem to be 2D. Short extract:
[ RUN ] ComputeTestSuite/ComputeTests_in_place_COMPLEX.COMPLEX_SINGLE_in_place_buffer/sizes_4x4_fwd_strides_0_7_1_bwd_strides_0_5_1_batches_2_Intel_R__Data_Center_GPU_Max_1100
Mismatching results: actual = (2.32784,-0.862237) vs. reference = (-0.0695089,0.350374)
relative error = 7.52116 absolute error = 2.68658 relative bound = 9.53674e-05 absolute bound = 1.55578e-05
at position 2, 0, 0
at indices 10, 8
Mismatching results: actual = (1.28088,-0.619282) vs. reference = (2.32784,-0.862237)
relative error = 0.432961 absolute error = 1.07478 relative bound = 9.53674e-05 absolute bound = 1.55578e-05
at position 2, 1, 0
at indices 11, 9
Mismatching results: actual = (0.626577,1.75821) vs. reference = (1.28088,-0.619282)
relative error = 1.7332 absolute error = 2.46588 relative bound = 9.53674e-05 absolute bound = 1.55578e-05
at position 2, 2, 0
at indices 12, 10
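The figures in the extract can be cross-checked by hand. The sketch below is only an illustration and rests on assumptions that are not confirmed against the test source: that the absolute error is the modulus of the complex difference, that the relative error is that value divided by the modulus of the reference, that the first reported index is computed from the backward strides in the test name (0, 5, 1), and that the second index refers to a contiguous 4x4 reference array. The helper `linear_index` and the constant names are hypothetical and are not part of oneMKL.

```python
# Values copied from the first mismatch in the log extract.
actual = complex(2.32784, -0.862237)
reference = complex(-0.0695089, 0.350374)

# Assumed definitions of the two error metrics (not taken from the test source).
abs_error = abs(actual - reference)
rel_error = abs_error / abs(reference)


def linear_index(position, strides):
    """Map a 2D (row, col) position to a flat index: offset + s1*row + s2*col."""
    offset, row_stride, col_stride = strides
    row, col = position
    return offset + row_stride * row + col_stride * col


BWD_STRIDES = (0, 5, 1)  # from the test name: bwd_strides_0_5_1
PACKED_4X4 = (0, 4, 1)   # assumption: the reference is stored contiguously

# The three logged positions are (2, 0), (2, 1) and (2, 2) in batch 0.
index_pairs = [
    (linear_index((2, col), BWD_STRIDES), linear_index((2, col), PACKED_4X4))
    for col in range(3)
]

print(f"abs error = {abs_error:.5f}, rel error = {rel_error:.4f}")
print(index_pairs)
```

Under these assumptions the computed index pairs are (10, 8), (11, 9) and (12, 10), matching the "at indices" lines in the log, and the error values agree with the logged ones to the printed precision. The extract also shows each actual value equal to the reference value reported for the next position, which may indicate a layout or offset problem rather than numerical noise.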
Note that the BLAS failures are reported in a separate issue: https://github.com/oneapi-src/oneMKL/issues/600
Expected behavior
The tests should pass.