compute-runtime
Build (tests) fail with 22.04 LTS / LLVM 13, when using system opencl-clang
Setup / deps
Latest release versions of everything, built from source:
- GMMlib:
- intel-gmmlib-22.1.6
- SPIRV Headers / Tools:
- sdk-1.3.216.0
- vc-intrinsics:
- v0.5.0
- IGC:
- igc-1.0.11485
- https://github.com/intel/intel-graphics-compiler/commit/12c99343388eba6e6275856b25e0fa8978585dfb cherry-picked to fix build config issue https://github.com/intel/intel-graphics-compiler/issues/248 with latest IGC release
- level-zero:
- v1.8.1
- compute-runtime:
- 22.28.23726
Issue
Tests run during the compute-runtime build work fine when I use an (EOL) version of Ubuntu:
- Ubuntu 21.10:
- LLVM 12
- opencl-clang 12.0
- llvm-spirv 12.0
- GCC 11.2
But when I use the latest LTS version with its newer LLVM:
- Ubuntu 22.04:
- LLVM 13
- opencl-clang 13.0
- llvm-spirv 13.0
- GCC 11.2
A large number of tests for a wide variety of HW fail like this:
[3932/3934] Running utility command for run_dg2_0_ocl_tests
FAILED: target_unit_tests/xe_hpg_core/dg2/CMakeFiles/run_dg2_0_ocl_tests.util
cd /home/nobody/source/compute-runtime/build/bin && echo Running igdrcl_tests 2x4x5 in /home/nobody/source/compute-runtime/build/bin && /usr/bin/cmake -E remove_directory /home/nobody/source/compute-runtime/build/bin/opencl/dg2/0/cl_cache && /usr/bin/cmake -E make_directory /home/nobody/source/compute-runtime/build/bin/opencl/dg2/0/cl_cache && echo Cmd line: /home/nobody/source/compute-runtime/build/bin/igdrcl_tests --product dg2 --slices 2 --subslices 4 --eu_per_ss 5 --gtest_repeat=1 --gtest_shuffle --gtest_random_seed=0 --disable_default_listener --rev_id 0 && /home/nobody/source/compute-runtime/build/bin/igdrcl_tests --product dg2 --slices 2 --subslices 4 --eu_per_ss 5 --gtest_repeat=1 --gtest_shuffle --gtest_random_seed=0 --disable_default_listener --rev_id 0
Running igdrcl_tests 2x4x5 in /home/nobody/source/compute-runtime/build/bin
Cmd line: /home/nobody/source/compute-runtime/build/bin/igdrcl_tests --product dg2 --slices 2 --subslices 4 --eu_per_ss 5 --gtest_repeat=1 --gtest_shuffle --gtest_random_seed=0 --disable_default_listener --rev_id 0
product family: dg2 (1270)
enable SIGALRM handler: 1
set timeout to: 45
enable SIGSEGV handler: 1
enable SIGABRT handler: 1
Iteration: 1. random_seed: 5121
/home/nobody/source/compute-runtime/opencl/test/unit_test/kernel/kernel_tests.cpp:490: Failure
Value of: pKernelInfo->getArgDescriptorAt(0).isReadOnly()
Actual: false
Expected: true
[ FAILED ][ DG2 ][ 5121 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
========================
== DG2 ULTs FAILED ==
========================
Tests run: 12487
Tests passed: 12263
Tests skipped: 223
Tests failed: 1
Tests disabled: 2
Time elapsed: 3175 ms
========================
[ FAILED ][ DG2 ][ 5121 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
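Since the same failure repeats across HW targets, it can help to collect the failed tests per product from the ULT output. The following is a minimal sketch, assuming the `[ FAILED ][ PRODUCT ][ SEED ] TestName` summary line format seen in the log above; the function name `failed_tests_by_product` and the sample log are hypothetical and not part of compute-runtime.

```python
import re
from collections import defaultdict

# Matches summary lines such as:
#   [ FAILED ][ DG2 ][ 5121 ] KernelFromBinaryTests.given...
# Captures the product, the random seed and the test name.
FAILED_LINE = re.compile(
    r"^\[\s*FAILED\s*\]\[\s*(\w+)\s*\]\[\s*(\d+)\s*\]\s+(\S+)\s*$"
)


def failed_tests_by_product(log_text):
    """Return {product: sorted list of unique failed test names}."""
    failures = defaultdict(set)
    for line in log_text.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            product, _seed, test_name = match.groups()
            # The log prints each failure twice; a set removes the repeat.
            failures[product].add(test_name)
    return {product: sorted(names) for product, names in failures.items()}


# Hypothetical sample assembled from the failure lines quoted in this thread.
SAMPLE_LOG = """\
[ FAILED ][ DG2 ][ 5121 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
========================
== DG2 ULTs FAILED ==
========================
Tests failed: 1
[ FAILED ][ DG2 ][ 5121 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
[ FAILED ][ ADLN ][ 19777 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
"""

if __name__ == "__main__":
    for product, names in failed_tests_by_product(SAMPLE_LOG).items():
        for name in names:
            print(f"{product}: {name}")
```

Run against a saved build log, this makes it easy to confirm that every failing target reports the same single test.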
IGC supports LLVM versions 11, 12, 13 and 14: https://github.com/intel/intel-graphics-compiler/projects?type=classic
Since your releases use LLVM 11, and tests with LLVM 12 seem to work fine, I guess this is an LLVM 13 related bug in compute-runtime?
Because 21.10 went out of support, I also tested with LLVM 12 [1] on 22.04, and that worked fine, same as on 21.10 earlier.
However, testing the new compute-runtime release 22.29.23750 on 22.04 with its default LLVM 13 [2] still fails.
[1] LLVM 12:
- Packages: clang-12, llvm-12-dev, liblld-12-dev, libopencl-clang-12-dev, llvm-spirv-12, libllvmspirvlib-12-dev
- IGC: -DIGC_OPTION__LLVM_PREFERRED_VERSION=12
[2] LLVM 13:
- Packages: clang-13, llvm-13-dev, liblld-13-dev, libopencl-clang-13-dev, llvm-spirv, libllvmspirvlib-13-dev
- IGC: -DIGC_OPTION__LLVM_PREFERRED_VERSION=13
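The two configurations above differ only in the package versions and the IGC CMake option, which can be written down as a small helper. This is a sketch under assumptions: the package lists and the `IGC_OPTION__LLVM_PREFERRED_VERSION` option are copied from [1] and [2], while the helper name `setup_commands` and the use of `sudo apt-get install` are my own illustration.

```python
# Package sets copied from configurations [1] and [2] above.
# Note that the LLVM 13 list uses the unversioned "llvm-spirv" package.
PACKAGES = {
    12: ["clang-12", "llvm-12-dev", "liblld-12-dev",
         "libopencl-clang-12-dev", "llvm-spirv-12", "libllvmspirvlib-12-dev"],
    13: ["clang-13", "llvm-13-dev", "liblld-13-dev",
         "libopencl-clang-13-dev", "llvm-spirv", "libllvmspirvlib-13-dev"],
}


def setup_commands(llvm_version):
    """Return (install command, IGC CMake option) for one LLVM version."""
    packages = PACKAGES[llvm_version]
    # Assumption: packages are installed with apt-get on Ubuntu.
    install = "sudo apt-get install " + " ".join(packages)
    igc_option = f"-DIGC_OPTION__LLVM_PREFERRED_VERSION={llvm_version}"
    return install, igc_option


if __name__ == "__main__":
    for version in sorted(PACKAGES):
        install, igc_option = setup_commands(version)
        print(install)
        print("IGC CMake option:", igc_option)
```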
I have observed exactly the same test failures when building against graphics compiler libraries (libigc1 and libigdfcl1 on Debian/Ubuntu systems) that were themselves built with either LLVM 13 or 14. All the failures are in the specific test givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly, found in opencl/test/unit_test/kernel/kernel_tests.cpp, and occur with these versions of compute-runtime: 22.29.23750, 22.34.24023, and 22.35.24055 (latest release). An example of these test failures (essentially identical to the failures seen by OP):
/build/intel-compute-runtime-22.35.24055/opencl/test/unit_test/kernel/kernel_tests.cpp:490: Failure
Value of: pKernelInfo->getArgDescriptorAt(0).isReadOnly()
Actual: false
Expected: true
[ FAILED ][ ADLN ][ 19777 ] KernelFromBinaryTests.givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly
The same test failures happened with the latest official attempted build of intel-compute-runtime for upcoming Ubuntu 22.10 Kinetic. See the official build log here (this was with graphics compiler 1.0.11702.1 built with LLVM 14):
https://launchpad.net/ubuntu/+source/intel-compute-runtime/22.34.24023-1/+build/24331481
In a display PPA that I maintain, where the latest graphics compiler 1.0.12149 was built with LLVM 13 for Ubuntu 22.04 Jammy (latest official Ubuntu release), the same failures were also observed. To allow a successful build even with these test failures, the CMake config in cmake/run_ult_target.cmake was patched to give the "true" result regardless of actual test results. The full build log of this Ubuntu 22.04 Jammy build (again, with a graphics compiler that was itself built with LLVM 13) is here:
https://launchpad.net/~savoury1/+archive/ubuntu/display/+build/24352215
Evidently some change in LLVM >= 13 (and graphics compiler then built with LLVM >= 13) causes the test givenArgumentDeclaredAsConstantWhenKernelIsCreatedThenArgumentIsMarkedAsReadOnly to fail.
This test passes fine on current Debian though. Debian migrated the intel ocl stack to LLVM 14 a while back.
I tried the latest public releases of the compute stack components, and compute-runtime tests continue to fail with LLVM 13.
This test passes fine on current Debian though. Debian migrated the intel ocl stack to LLVM 14 a while back.
Thanks, that's good to know!
I would like to build on Ubuntu 22.04 LTS though, but that has only "libopencl-clang12" + "libopencl-clang13", although 22.04 already includes LLVM 14. Only Ubuntu 22.10 has "libopencl-clang14" [1], but 22.10 is not released yet.
=> Any idea whether "libopencl-clang14" will be included in future Ubuntu 22.04 LTS HWE stack updates?
I.e. will I be able to upgrade 22.04 LTS OCL builds to LLVM 14 [2], which is the next version after LLVM 11 to get production support [3] from the Intel compute stack, or should I just switch to 22.10 or Debian testing? [4]
[1] Plus other related LLVM 14 based libs: https://packages.ubuntu.com/kinetic/libopencl-clang14
[2] Although Ubuntu 22.10 is still building Intel OCL with LLVM 11: https://packages.ubuntu.com/kinetic/intel-opencl-icd
[3] According to: https://github.com/intel/intel-graphics-compiler/projects/2
[4] I know I could also build LLVM with OCL, but I'd rather switch distro than start doing that
I don't have plans to backport these
[2] is true because the newer upload to build with llvm14 doesn't build...
I don't have plans to backport these
OK.
[2] is true because the newer upload to build with llvm14 doesn't build...
Do you have bug on that (for compute-runtime or IGC)?
no bug, but seems there was an update to opencl-clang-14 in debian that kinetic didn't have, it cherry-picks https://github.com/intel/opencl-clang/pull/354 to fix some compute-runtime test error and I've synced o-c-14 to see if it helps with the build..
yup, that did it, phew..
Thanks @tjaalton for the tip about intel/opencl-clang#354, as that turned out to be the key to the test failures. Including the patch from that pull request in the backported Debian package intel-opencl-clang 13.0.0-5 gives a successful build of the Intel OCL stack, specifically using LLVM 13 (not 14). This makes it clear that the culprit for these particular test failures was indeed the change in SPIRV-LLVM-Translator >= 13.0.0 regarding kernel argument type parsing.
So first building intel-opencl-clang 13.0.0 but including intel/opencl-clang#354, then building IGC 1.0.12149.7 against this new libopencl-clang-dev (providing libopencl-clang-13-dev), followed by building latest compute-runtime 22.39.24347 against new libigdfcl1 and libigdfcl-dev, gives a completely successful test run. Again, using LLVM 13, not 14, for the entire stack.
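The build order described above can be sketched as a small dependency graph. This is only an illustration, assuming each component depends solely on the one named before it; the component labels come from this comment, and the helper name `build_order` is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the set of components it must be built after.
OPENCL_CLANG = "intel-opencl-clang 13.0.0 + intel/opencl-clang#354"
IGC = "IGC 1.0.12149.7"
COMPUTE_RUNTIME = "compute-runtime 22.39.24347"

BUILD_DEPENDENCIES = {
    OPENCL_CLANG: set(),
    IGC: {OPENCL_CLANG},          # needs the patched libopencl-clang-dev
    COMPUTE_RUNTIME: {IGC},       # needs the new libigdfcl1 / libigdfcl-dev
}


def build_order(dependencies):
    """Return the components in an order that satisfies all dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, component in enumerate(build_order(BUILD_DEPENDENCIES), start=1):
        print(f"{step}. {component}")
```

The point of the ordering is that the opencl-clang fix only takes effect once IGC and then compute-runtime are rebuilt against it.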
For any Ubuntu users wanting these versions of the mentioned packages please see my display PPA which will have builds available for Ubuntu 16.04, 18.04, 20.04, and 22.04 LTS by end of today (currently in progress, IGC 1.0.12149.7 builds are underway and compute-runtime will be uploaded when those builds are finished and published).
It seems to have been pulled to all relevant opencl-clang LLVM branches:
- 13: https://github.com/intel/opencl-clang/pull/354
- 14: https://github.com/intel/opencl-clang/pull/365
- master: https://github.com/intel/opencl-clang/pull/366
As Timo commented that he's not backporting the fix to 22.04, I guess this bug could be closed once it's documented that the compute-runtime project does not support LLVM v13+ on Ubuntu 22.04 (or older) when using the system (= buggy) version of opencl-clang.
[2] Although Ubuntu 22.10 is still building Intel OCL with LLVM 11: https://packages.ubuntu.com/kinetic/intel-opencl-icd
With the opencl-clang fix, Intel OCL in 22.10 seems to have been updated to LLVM v14.