llvm icon indicating copy to clipboard operation
llvm copied to clipboard

group_ballot implementation is missing for AMD GPU/HIP targets

Open stefanatwork opened this issue 3 years ago • 0 comments

Describe the bug Code that uses sycl::ext::oneapi::group_ballot will fail with a linker error when compiling to amdgcn-amd-amdhsa.

Error message: lld: error: undefined hidden symbol: __spirv_GroupNonUniformBallot(unsigned int, bool)

To Reproduce

  1. Create a source file test.cpp with this code:
#include <CL/sycl.hpp>
#define N 256
int main()
{
  sycl::queue queue;
  sycl::buffer<int, 1> A(N);
  sycl::buffer<int, 1> B(N);

  sycl::host_accessor A_host_acc(A, sycl::write_only);
  for (size_t i = (size_t)0; i < N; i++) {
    A_host_acc[i] = rand() % 32;
  }

  queue.submit([&](sycl::handler &cgh) {
    sycl::accessor A_acc(A, cgh, sycl::read_only);
    sycl::accessor B_acc(B, cgh, sycl::write_only, sycl::no_init);

    cgh.parallel_for<class ballot>(N, [=]  (sycl::id<1> idx) { 
      B_acc[idx] = sycl::ext::oneapi::group_ballot(sycl::ext::oneapi::experimental::this_sub_group(), A_acc[idx] > 0.5f).count();
      });
  });
  queue.wait();
  return 0 ;
}
  1. Compile with clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfx1032 -o test test.cpp
  2. Expected result: compiler exits without error, producing an executable named test
  3. Actual result: compiler exits with error
lld: error: undefined hidden symbol: __spirv_GroupNonUniformBallot(unsigned int, bool)
>>> referenced by lto.tmp:(typeinfo name for sycl::_V1::detail::__pf_kernel_wrapper<main::'lambda'(sycl::_V1::handler&)::operator()(sycl::_V1::handler&) const::ballot>)
>>> referenced by lto.tmp:(typeinfo name for sycl::_V1::detail::__pf_kernel_wrapper<main::'lambda'(sycl::_V1::handler&)::operator()(sycl::_V1::handler&) const::ballot>)
>>> referenced by lto.tmp:(_ZTSN4sycl3_V16detail19__pf_kernel_wrapperIZZ4mainENKUlRNS0_7handlerEE_clES4_E6ballotEE_with_offset)
>>> referenced 5 more times

Environment:

  • OS: Ubuntu 20.04
  • Target device and vendor: AMD Radeon RX6600 (gfx1032)
  • DPC++ version: 85a6833d6cfcc7625d1cd3d7513bf7f0312712f3
  • Dependencies version: HIP 5.2.0

stefanatwork avatar Sep 07 '22 14:09 stefanatwork