
joint_matrix_bf16_fill_k_cache_unroll.cpp failing after 90ba33099cbb

Open jsji opened this issue 1 year ago • 3 comments

Describe the bug

https://github.com/intel/llvm/actions/runs/7672516896/job/20913879599

FAIL: SYCL :: Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp (111 of 120)
******************** TEST 'SYCL :: Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 10
/__w/llvm/llvm/toolchain/bin//clang++   -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp -mllvm -inline-threshold=2000 -ffp-model=precise -o /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out -DMANUAL_UNROLL
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/XMX8/joint_matrix_bf16_fill_k_cache_unroll.cpp -mllvm -inline-threshold=2000 -ffp-model=precise -o /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out -DMANUAL_UNROLL
# note: command had no output on stdout or stderr
# RUN: at line 11
env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out
# executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/XMX8/Output/joint_matrix_bf16_fill_k_cache_unroll.cpp.tmp.out
# .---command stdout------------
# | Incorrect result in matrix. i: 0, j: 0, Ref: -0.687566, Val: 0.110868, Diff: 0.798434, Epsilon: 0.1
# | DONE for size 256
# | GOPS is 59.4644 Gop/s
# `-----------------------------
# error: command failed with exit status: 1

--

The test passes without https://github.com/llvm/llvm-project/commit/90ba33099cbb; see this run: https://github.com/intel/llvm/actions/runs/7689888115/job/20953236115
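
For context on the "Incorrect result in matrix" line in the log: the reported Diff is the absolute difference between the reference and computed values, |-0.687566 - 0.110868| = 0.798434, which is well above the 0.1 epsilon. The snippet below is a minimal sketch of that kind of check, assuming a plain absolute-difference comparison over row-major data. The helper name `matrices_match` and its signature are hypothetical and are not taken from the test sources.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not the test's real code): compares two row-major
// matrices element by element and returns false at the first element whose
// absolute difference exceeds `epsilon` (a NaN difference also fails).
bool matrices_match(const std::vector<float> &ref,
                    const std::vector<float> &val, std::size_t rows,
                    std::size_t cols, float epsilon) {
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      const float r = ref[i * cols + j];
      const float v = val[i * cols + j];
      const float diff = std::fabs(r - v);
      if (std::isnan(diff) || diff > epsilon) {
        // Message format modeled on the log line quoted in this issue.
        std::printf("Incorrect result in matrix. i: %zu, j: %zu, Ref: %f, "
                    "Val: %f, Diff: %f, Epsilon: %f\n",
                    i, j, r, v, diff, epsilon);
        return false;
      }
    }
  }
  return true;
}
```

With the values from the log, `matrices_match({-0.687566f}, {0.110868f}, 1, 1, 0.1f)` returns false, matching the failure seen at i: 0, j: 0.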

jsji avatar Jan 29 '24 15:01 jsji

@YuriPlyakhin @dkhaldi Can you have a look? Thanks.

Looks like Matrix tests are sensitive to GEP optimizations.

jsji avatar Jan 29 '24 15:01 jsji

@jsji, @dkhaldi, I reproduced the same failure on PVC. I'm investigating...

YuriPlyakhin avatar Jan 31 '24 21:01 YuriPlyakhin

The fix was merged to IGC. As soon as a new GPU driver with the fix is used in testing and the test starts to pass, the XFAIL can be removed and this issue closed.

YuriPlyakhin avatar Feb 14 '24 22:02 YuriPlyakhin

Fix was merged to OCL CPU driver. XFAIL can be removed on next OCL CPU driver uplift.

Nuullll avatar Aug 13 '24 01:08 Nuullll