
No diagnostic for conflicting kernel names in different source files

Open • hvdijk opened this issue 7 months ago • 1 comment

Describe the bug

SYCL kernels are identified by their kernel name, so kernel names must be unique across the whole program. DPC++ enforces this requirement within a single translation unit, but not across translation units.

To reproduce

$ cat sycl.cc
void foo();
void bar();

int main() {
  foo();
  bar();
}
$ cat sycl1.cc
#include <iostream>
#include <sycl/sycl.hpp>

struct kernel;

void foo() {
  int var = 0;

  sycl::queue queue;
  {
    sycl::buffer<int, 1> buf(&var, sycl::range<1>(1));
    queue.submit([&] (sycl::handler& h) {
      auto acc = buf.get_access<sycl::access_mode::write>(h);
      h.single_task<kernel>([=] () {
        acc[0] = 1;
      });
    });
  }

  std::cout << "var = " << var << std::endl;
}
$ cat sycl2.cc
#include <iostream>
#include <sycl/sycl.hpp>

struct kernel;

void bar() {
  int var = 0;

  sycl::queue queue;
  {
    sycl::buffer<int, 1> buf(&var, sycl::range<1>(1));
    queue.submit([&] (sycl::handler& h) {
      auto acc = buf.get_access<sycl::access_mode::write>(h);
      h.single_task<kernel>([=] () {
        acc[0] = 2;
      });
    });
  }

  std::cout << "var = " << var << std::endl;
}
$ clang++ -fsycl sycl.cc sycl1.cc sycl2.cc -o sycl
$ ./sycl
var = 1
var = 1
$ 

Note how bar() executed the wrong kernel: it should have printed var = 2, but it ran the kernel defined in foo() and printed var = 1.
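
To illustrate why a name collision can produce this output, here is a toy model in plain C++ (no SYCL). It is only a sketch under an assumption: that kernels are looked up in a table keyed by name and that the first entry registered under a name wins. `KernelTable`, `register_kernel`, `run_kernel` and `simulate_foo_then_bar` are hypothetical names for this illustration, not DPC++ runtime APIs, and the real lookup mechanism may differ.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel table: kernel name -> kernel body.
using KernelTable = std::map<std::string, std::function<void(int&)>>;

// Registers a kernel under its name. std::map::emplace never overwrites an
// existing key, so a second registration under the same name is dropped.
// Returns false when the name was already taken.
bool register_kernel(KernelTable& table, const std::string& name,
                     std::function<void(int&)> body) {
  return table.emplace(name, std::move(body)).second;
}

// Looks the kernel up by name only and runs it on a fresh variable.
int run_kernel(const KernelTable& table, const std::string& name) {
  auto it = table.find(name);
  assert(it != table.end());
  int var = 0;
  it->second(var);
  return var;
}

// Models foo() and bar(): both register and launch a kernel named
// "_ZTS6kernel", the mangled name shown in the single-file diagnostic.
std::vector<int> simulate_foo_then_bar() {
  KernelTable table;
  register_kernel(table, "_ZTS6kernel", [](int& v) { v = 1; });  // sycl1.cc
  register_kernel(table, "_ZTS6kernel", [](int& v) { v = 2; });  // sycl2.cc
  return {run_kernel(table, "_ZTS6kernel"),   // foo()
          run_kernel(table, "_ZTS6kernel")};  // bar()
}
```

In this model both launches resolve to the kernel from sycl1.cc, and nothing reports that the second registration was ignored.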

Environment

  • OS: Linux
  • Target device: Intel OpenCL CPU
  • DPC++ version: d4f2fe54047a1b415af2402a497f20e918094580
  • Dependencies version:
$ sycl-ls --verbose
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz OpenCL 3.0 (Build 0) [2024.17.5.0.08_160000.xmain-hotfix]
[opencl:cpu][opencl:1] Portable Computing Language, cpu-haswell-Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz OpenCL 3.0 PoCL HSTR: cpu-x86_64-pc-linux-gnu-haswell [5.0+debian]
[opencl:fpga][opencl:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2024.17.5.0.08_160000.xmain-hotfix]
[native_cpu:cpu][native_cpu:0] SYCL_NATIVE_CPU, SYCL Native CPU 0.1 [0.0.0]

Platforms: 4
Platform [#1]:
    Version  : OpenCL 3.0 LINUX
    Name     : Intel(R) OpenCL
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type              : cpu
        Version           : OpenCL 3.0 (Build 0)
        Name              : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
        Vendor            : Intel(R) Corporation
        Driver            : 2024.17.5.0.08_160000.xmain-hotfix
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : cpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_system_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_oneapi_srgb ext_oneapi_native_assert ext_intel_legacy_image ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_private_alloca
        info::device::sub_group_sizes: 4 8 16 32 64
        Architecture: x86_64
Platform [#2]:
    Version  : OpenCL 3.0 PoCL 5.0+debian  Linux, None+Asserts, RELOC, SPIR, LLVM 16.0.6, SLEEF, DISTRO, POCL_DEBUG
    Name     : Portable Computing Language
    Vendor   : The pocl project
    Devices  : 1
        Device [#1]:
        Type              : cpu
        Version           : OpenCL 3.0 PoCL HSTR: cpu-x86_64-pc-linux-gnu-haswell
        Name              : cpu-haswell-Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
        Vendor            : GenuineIntel
        Driver            : 5.0+debian
        Num SubDevices    : 0
        Num SubSubDevices : 0
        Aspects           : cpu fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_oneapi_srgb ext_intel_legacy_image ext_oneapi_ballot_group ext_oneapi_fixed_size_group ext_oneapi_opportunistic_group ext_oneapi_tangle_group ext_oneapi_limited_graph ext_oneapi_private_alloca
        info::device::sub_group_sizes: 1 2 4 8 16 32 64 128 256 512
        Architecture: SYCL Exception encountered: Native API failed. Native API returns: -30 (PI_ERROR_INVALID_VALUE) -30 (PI_ERROR_INVALID_VALUE)

Additional context

When the translation units are combined into a single file, DPC++ does diagnose this:

$ cat sycl.cc sycl1.cc sycl2.cc > combined.cc
$ clang++ -fsycl combined.cc -o sycl
combined.cc:42:29: error: definition with same mangled name '_ZTS6kernel' as another definition
   42 |       h.single_task<kernel>([=] () {
      |                             ^
combined.cc:21:29: note: previous definition is here
   21 |       h.single_task<kernel>([=] () {
      |                             ^
1 error generated.

This affects SYCL-CTS test_all, https://github.com/KhronosGroup/SYCL-CTS/issues/904.

I initially commented on https://github.com/intel/llvm/issues/10659 about this, but this looks like a different issue that manifests itself in exactly the same way. Issue 10659 is about unnamed kernels, where the compiler-generated name is not unique. This issue is about named kernels, where the user-specified name is not unique. A solution for one will not be a solution for the other: the former requires the compiler to change how it names kernels, while the latter should result in a clear error.
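
As a rough idea of what a cross-translation-unit check for named kernels could look like, here is a minimal sketch in plain C++. It assumes the kernel names defined by each translation unit are available at some later stage (for example at link time). The `TranslationUnit` struct, the `find_conflicts` function and the message wording are hypothetical and not part of DPC++.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one translation unit's device code:
// the source file and the mangled names of the kernels it defines.
struct TranslationUnit {
  std::string file;
  std::vector<std::string> kernel_names;
};

// Returns one message for every kernel name that has already been
// defined by an earlier entry.
std::vector<std::string> find_conflicts(
    const std::vector<TranslationUnit>& units) {
  std::map<std::string, std::string> first_definition;  // name -> file
  std::vector<std::string> errors;
  for (const auto& tu : units) {
    for (const auto& name : tu.kernel_names) {
      // emplace leaves an existing entry untouched and reports whether
      // the name was newly inserted.
      auto [it, inserted] = first_definition.emplace(name, tu.file);
      if (!inserted) {
        errors.push_back(tu.file + ": error: kernel name '" + name +
                         "' already defined in " + it->second);
      }
    }
  }
  return errors;
}
```

Called with sycl1.cc and sycl2.cc each defining `_ZTS6kernel`, this sketch returns a single message that points at sycl2.cc and names sycl1.cc as the previous definition.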

hvdijk, Jun 25 '24 09:06