[SYCL][CUDA] accessor_api_image CTS test is failing
Build compiler
git clone https://github.com/intel/llvm
Hash: b00fb7c
Includes: #1990, #1977
python /localdisk2/ws/againull/sycl/llvm/buildbot/configure.py --cuda -o /localdisk2/ws/againull/sycl/build
python /localdisk2/ws/againull/sycl/llvm/buildbot/compile.py -o /localdisk2/ws/againull/sycl/build
Build accessor CTS tests
git clone https://github.com/KhronosGroup/SYCL-CTS.git
Hash: 9cbe1a719b25c269ef78a2ee08f2e5ed12a1cc6d
Applied: KhronosGroup/SYCL-CTS#52
cmake -G Ninja -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DINTEL_SYCL_ROOT=/localdisk2/ws/againull/sycl/build -DINTEL_SYCL_TRIPLE=nvptx64-nvidia-cuda-sycldevice -DSYCL_IMPLEMENTATION=Intel_SYCL -DSYCL_CTS_ENABLE_OPENCL_INTEROP_TESTS=Off -DSYCL_CTS_ENABLE_DOUBLE_TESTS=On -DSYCL_CTS_ENABLE_HALF_TESTS=On -DINTEL_SYCL_FLAGS="-Xsycl-target-backend;--cuda-gpu-arch=sm_50" -DOpenCL_INCLUDE_DIR=/localdisk2/ws/againull/sycl/build/include/sycl -DOpenCL_LIBRARY=/localdisk2/ws/againull/sycl/build/lib/libOpenCL.so ..
ninja test_accessor -j 12
Run accessor_api_image CTS test
=> ./bin/test_accessor -p nvidia -d opencl_gpu --test accessor_api_image
--- accessor_api_image
. accessor<vec<int32_t, 4>, 1, mode{1024}, target{2017}>
. Checking get_range
PI CUDA ERROR:
Value: 500
Name: CUDA_ERROR_NOT_FOUND
Description: named symbol not found
Function: build_program
Source Location: /iusers/againull/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:468
PI CUDA ERROR:
Value: 400
Name: CUDA_ERROR_INVALID_HANDLE
Description: invalid resource handle
Function: cuda_piProgramRelease
Source Location: /iusers/againull/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:2807
. sycl exception caught
. what - The program was built for 1 devices
Build program log for 'GeForce GTX 1060 6GB':
-999 (Unknown OpenCL error code)
. line: 63
. a SYCL exception was caught: The program was built for 1 devices
Build program log for 'GeForce GTX 1060 6GB':
-999 (Unknown OpenCL error code)
- fail
--- accessor_api_image_fp16 . Device does not support half precision floating point operations
- pass
. Passed 1/14 tests (7%)
@againull, @pvchupin, I think we figured out that this was caused by the regression in the driver. Can we close this one?
@bader Yes. The test passes with the 435.21 nvidia driver when https://github.com/KhronosGroup/SYCL-CTS/pull/52 is applied. Do you know when this PR is going to be merged?
It looks like the issue is still reproducible on the latest driver, 450.102.04. Let's reopen it, at least for tracking purposes.
The 500 CUDA_ERROR_NOT_FOUND (which turns into an 801 CUDA_ERROR_NOT_SUPPORTED for CUDA toolkit 11.3 and above) is caused by the suq.depth PTX instruction used in 3d sampled reads at
https://github.com/intel/llvm/blob/90c8f0543a38adeda75ad2eca7e999a36a1f2697/libclc/ptx-nvidiacl/libspirv/images/image.cl#L151
https://github.com/intel/llvm/blob/90c8f0543a38adeda75ad2eca7e999a36a1f2697/libclc/ptx-nvidiacl/libspirv/images/image.cl#L158
NVIDIA says that this error is expected, since these PTX instructions work only when they are used within the 'OpenCL driver'.
Unfortunately, they added that there is no way to use this instruction with CUDA. They promised to update the PTX documentation to make this clear. If suq.depth is present in the fatbin, it produces the aforementioned error even if it is never executed. For this reason, it has been removed with https://github.com/intel/llvm/pull/5378.
Another error emerged once the one above was out of the way: a 700 CUDA_ERROR_ILLEGAL_ADDRESS, which appeared in sampled reads with linear filtering. This has been fixed with https://github.com/intel/llvm/pull/5204.
Unfortunately, the previous two errors are just the tip of the iceberg. Passing the accessor_api_image_core test requires support for (u)int 8bit channels and their related conversion functions, which are completely missing right now: according to the specification, (u)int accessors can read (u)int_{32,16,8}b channels. Further details can be found in Sections 6.12.14 and 8.3 of the OpenCL 1.2 Specification.
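To give an idea of what the missing conversion functions have to do, here is a minimal C++ sketch of the integer channel rules as I read them in Section 8.3 of the OpenCL 1.2 Specification: 8bit channel values are widened to 32 bits on read, and 32bit values are saturated to the channel range on write. All function names are hypothetical and are not taken from libclc or from the CUDA plugin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helpers (assumption: not the libclc implementation). They only
// illustrate what an int/uint accessor needs in order to use 8bit channels.

// Read path: a signed 8bit channel value is sign-extended to 32 bits.
inline std::int32_t read_int8_channel(std::int8_t stored) {
  return static_cast<std::int32_t>(stored);
}

// Read path: an unsigned 8bit channel value is zero-extended to 32 bits.
inline std::uint32_t read_uint8_channel(std::uint8_t stored) {
  return static_cast<std::uint32_t>(stored);
}

// Write path: a signed 32bit value is clamped to [-128, 127] before storing.
inline std::int8_t write_int8_channel(std::int32_t value) {
  const std::int32_t lo = std::numeric_limits<std::int8_t>::min();
  const std::int32_t hi = std::numeric_limits<std::int8_t>::max();
  return static_cast<std::int8_t>(std::max(lo, std::min(value, hi)));
}

// Write path: an unsigned 32bit value is clamped to [0, 255] before storing.
inline std::uint8_t write_uint8_channel(std::uint32_t value) {
  const std::uint32_t hi = std::numeric_limits<std::uint8_t>::max();
  return static_cast<std::uint8_t>(std::min(value, hi));
}

// Small self-check of the behaviour described in the comments above.
inline void check_channel_conversions() {
  assert(read_int8_channel(-1) == -1);
  assert(write_uint8_channel(256u) == 255);
}
```

The 16bit channels would follow the same pattern with std::int16_t and std::uint16_t.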
In order to let this test pass, the image support has been marked as experimental and deactivated by default with https://github.com/intel/llvm/pull/5204.
In summary:
- the only types supported are (u)int (32bit), float and half,
- 8bit channels and related conversion functions are missing,
- writes and non-sampled reads are supported,
- 1d/2d sampled reads are supported,
- 3d sampled reads are not functioning due to suq.depth.
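The summary above can be written down as a small support table. The following C++ sketch is only an illustration under the assumptions stated in this comment: the enum and the function names are hypothetical and do not correspond to types in the PI CUDA plugin.

```cpp
#include <cassert>

// Hypothetical enum (assumption): the names loosely follow the OpenCL image
// channel data types; this is not the type used by the plugin.
enum class ChannelType {
  SignedInt8,
  SignedInt16,
  SignedInt32,
  UnsignedInt8,
  UnsignedInt16,
  UnsignedInt32,
  HalfFloat,
  Float
};

// Channel types listed as supported in the summary: (u)int 32bit, float, half.
inline bool channel_type_supported(ChannelType t) {
  switch (t) {
  case ChannelType::SignedInt32:
  case ChannelType::UnsignedInt32:
  case ChannelType::HalfFloat:
  case ChannelType::Float:
    return true;
  default:
    return false;
  }
}

// Sampled reads work for 1d and 2d images only; 3d needs suq.depth.
inline bool sampled_read_supported(int dimensions) {
  return dimensions == 1 || dimensions == 2;
}

// Small self-check of the two rules above.
inline void check_support_table() {
  assert(channel_type_supported(ChannelType::Float));
  assert(!sampled_read_supported(3));
}
```

A check like this, done before the kernel is built, would let a test skip unsupported combinations instead of failing at program build time.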
@pgorlani, thanks a lot for nice and detailed summary!
A recent version of test_all (and test_accessor_legacy) aborts on NVIDIA GPU.
pi_die: PI CUDA kernels only support images with channel types int32, uint32, float, and half.
terminate called without an active exception
-------------------------------------------------------------------------------
accessor_api_image_core
-------------------------------------------------------------------------------
SYCL-CTS/tests/accessor_legacy/../common/../../util/proxy.h:35
...............................................................................
SYCL-CTS/tests/accessor_legacy/../common/../../util/proxy.h:35: FAILED:
due to a fatal error condition:
SIGABRT - Abort (abnormal termination) signal
Can we change pi_die to an exception? pi_die aborts the execution of the whole test suite, whereas an exception will just fail a single test.
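To illustrate the difference, here is a minimal, generic C++ sketch of a test loop that survives a failing case when the error is reported as an exception. Everything in it is hypothetical (create_image_or_throw, TestCase, run_all); it is not the SYCL-CTS runner and not the plugin code.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a backend call that rejects an image format.
// Throwing here, instead of calling std::abort(), lets the caller recover.
inline void create_image_or_throw(bool format_supported) {
  if (!format_supported)
    throw std::runtime_error("unsupported image channel type");
}

struct TestCase {
  std::string name;
  std::function<void()> body;
};

struct TestResult {
  std::string name;
  bool passed;
};

// Runs every test case; an exception fails only the case that raised it.
inline std::vector<TestResult> run_all(const std::vector<TestCase> &tests) {
  std::vector<TestResult> results;
  for (const TestCase &test : tests) {
    try {
      test.body();
      results.push_back(TestResult{test.name, true});
    } catch (const std::exception &) {
      results.push_back(TestResult{test.name, false});
    }
  }
  return results;
}
```

With an abort in place of the throw, the loop would never reach the second test case, which is what happens to test_all today.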
Hi @bader, here is https://github.com/intel/llvm/pull/6521, it should fit our needs.
This seems to be addressed by the change to an error report rather than exit/die so closing.