chipStar
rocPRIM test_warp_load failing with Intel CPU driver
Both Intel's and PoCL's CPU drivers produce wrong results with this fairly simple test. Because the test is very short, it is worth investigating, and the underlying problem could affect other kernels too. Perhaps the test assumes warp lock-step semantics, or relies on multiple writers to the same destination succeeding, or something similar.
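To make the lock-step hypothesis concrete, here is a minimal host-side C++ sketch. It is not rocPRIM code and does not confirm the actual cause; every name in it (`run_exchange`, `kWarpSize`, `kItemsPerThread`) is hypothetical. It simulates a small warp exchanging items through a shared buffer with no barrier between the write and read phases, and compares lanes advancing in lock step against lanes running one after another, which is one way a CPU driver might schedule work-items.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration only (the failing tests use 4 and 32).
constexpr std::size_t kItemsPerThread = 2;
constexpr std::size_t kWarpSize = 4;

// Simulates a blocked-to-striped exchange through a shared buffer.
// lock_step == true : all lanes finish the write phase before any lane reads.
// lock_step == false: each lane writes and then reads before the next lane runs.
std::vector<int> run_exchange(bool lock_step) {
    const std::size_t n = kWarpSize * kItemsPerThread;
    std::vector<int> shared(n, -1);  // -1 marks a slot nobody has written yet
    std::vector<int> out(n, -1);

    auto write_phase = [&](std::size_t lane) {
        for (std::size_t i = 0; i < kItemsPerThread; ++i)
            shared[lane * kItemsPerThread + i] = static_cast<int>(lane * kItemsPerThread + i);
    };
    auto read_phase = [&](std::size_t lane) {
        for (std::size_t i = 0; i < kItemsPerThread; ++i)
            out[lane * kItemsPerThread + i] = shared[i * kWarpSize + lane];
    };

    if (lock_step) {
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) write_phase(lane);
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) read_phase(lane);
    } else {
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            write_phase(lane);
            read_phase(lane);  // may read slots that later lanes have not written
        }
    }
    return out;
}

// Checks both schedules: lock step yields the striped layout, sequential does not.
void self_check() {
    const std::vector<int> good = run_exchange(true);
    for (std::size_t lane = 0; lane < kWarpSize; ++lane)
        for (std::size_t i = 0; i < kItemsPerThread; ++i)
            assert(good[lane * kItemsPerThread + i] == static_cast<int>(i * kWarpSize + lane));

    const std::vector<int> bad = run_exchange(false);
    assert(bad != good);
}
```

In this sketch the sequential schedule leaves lane 0 reading a slot that lane 2 has not yet filled. A kernel with this pattern would pass on a device that runs the warp in lock step and fail on one that does not, unless a barrier separates the two phases.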
Still fails with PoCL-CPU (v4.0) and Intel CPU.
[ PASSED ] 16 tests.
[ FAILED ] 14 tests, listed below:
[ FAILED ] WarpLoadTest/8.WarpLoad, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)0>
[ FAILED ] WarpLoadTest/8.WarpLoadGuarded, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)0>
[ FAILED ] WarpLoadTest/9.WarpLoad, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)1>
[ FAILED ] WarpLoadTest/9.WarpLoadGuarded, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)1>
[ FAILED ] WarpLoadTest/10.WarpLoad, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)2>
[ FAILED ] WarpLoadTest/10.WarpLoadGuarded, where TypeParam = Params<int, 4u, 32u, (rocprim::warp_load_method)2>
[ FAILED ] WarpLoadTest/11.WarpLoad, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)0>
[ FAILED ] WarpLoadTest/11.WarpLoadGuarded, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)0>
[ FAILED ] WarpLoadTest/12.WarpLoad, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)1>
[ FAILED ] WarpLoadTest/12.WarpLoadGuarded, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)1>
[ FAILED ] WarpLoadTest/13.WarpLoad, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)2>
[ FAILED ] WarpLoadTest/13.WarpLoadGuarded, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)2>
[ FAILED ] WarpLoadTest/14.WarpLoad, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)3>
[ FAILED ] WarpLoadTest/14.WarpLoadGuarded, where TypeParam = Params<HIP_vector_type<float, 2u>, 4u, 32u, (rocprim::warp_load_method)3>
Works with Intel iGPU via OpenCL, Intel iGPU via Level Zero, and PoCL via the Level Zero driver, so this does not block the 1.0 release.
This still fails with Intel OpenCL CPU, so keeping it open.