joint_matrix_bf16_fill_k_cache_unroll.cpp failing after 90ba33099cbb
Describe the bug
https://github.com/intel/llvm/actions/runs/7672516896/job/20913879599
FAIL: SYCL :: Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp (111 of 120)
******************** TEST 'SYCL :: Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp' FAILED ********************
Exit Code: 1
Command Output (stdout):
--
# RUN: at line 10
/__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp -mllvm -inline-threshold=2000 -ffp-model=precise -o /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out -DMANUAL_UNROLL
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp -mllvm -inline-threshold=2000 -ffp-model=precise -o /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out -DMANUAL_UNROLL
# note: command had no output on stdout or stderr
# RUN: at line 11
env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out
# executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out
# .---command stdout------------
# | Incorrect result in matrix. i: 0, j: 0, Ref: -0.687566, Val: 0.110868, Diff: 0.798434, Epsilon: 0.1
# | DONE for size 256
# | GOPS is 59.4644 Gop/s
# `-----------------------------
# error: command failed with exit status: 1
--
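For context, the numbers in the log are consistent with an absolute-difference check: |-0.687566 - 0.110868| = 0.798434, which is well above the epsilon of 0.1. A minimal sketch of such a check follows, assuming the test compares each element against a reference this way. The helper name `within_tolerance` is hypothetical and the actual comparison code in the test suite may differ.

```cpp
#include <cmath>

// Hypothetical helper (not taken from the test suite): returns true when the
// computed value is within an absolute tolerance of the reference value.
bool within_tolerance(float ref, float val, float epsilon) {
  return std::fabs(ref - val) <= epsilon;
}

// Hypothetical helper: the absolute difference reported as "Diff" in the log.
float abs_diff(float ref, float val) {
  return std::fabs(ref - val);
}
```

With the values from the log, `within_tolerance(-0.687566f, 0.110868f, 0.1f)` is false, so the difference is far too large to be ordinary bf16 rounding noise.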
The test passes without https://github.com/llvm/llvm-project/commit/90ba33099cbb; see https://github.com/intel/llvm/actions/runs/7689888115/job/20953236115
@YuriPlyakhin @dkhaldi Can you have a look? Thanks.
Looks like Matrix tests are sensitive to GEP optimizations.
@jsji, @dkhaldi, I reproduced the same failure on PVC. I'm investigating...
The fix was merged to IGC. Once a new GPU driver containing the fix is used in testing and the test starts to pass, the XFAIL can be removed and this issue closed.
The fix was merged to the OCL CPU driver. The XFAIL can be removed on the next OCL CPU driver uplift.