llvm icon indicating copy to clipboard operation
llvm copied to clipboard

sycl spec make_device not working

Open ye-luo opened this issue 3 years ago • 14 comments

Describe the bug auto device = sycl::make_device<sycl::backend::ext_oneapi_level_zero>((_ze_device_handle_t*)hDevice); fails at run.

terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE

I have to do non-portable.

const sycl::platform sycl_platform=sycl::ext::oneapi::level_zero::make_platform(reinterpret_cast<pi_native_handle>(hPlatform));
auto device = sycl::ext::oneapi::level_zero::make_device(sycl_platform, reinterpret_cast<pi_native_handle>(hDevice));

it does work.

both above need me to include

#include <level_zero/ze_api.h>

This adds another level of complexity since this file is not shipped by the compiler but level-zero-dev package.

Another one I tried is

auto device = sycl::detail::make_device(reinterpret_cast<pi_native_handle>(hDevice), sycl::backend::ext_oneapi_level_zero);

it doesn't need ze_api.h header file. However, I got the same -30 error.

Does level-zero have spec compliant interoperability API implementation? Or I have to rely on the extension?

ye-luo avatar Mar 09 '22 21:03 ye-luo

A bit more background, I was testing interop with OpenMP from icpx

                auto hPlatform = omp_get_interop_ptr(o, omp_ipr_platform, &err);
                auto hContext = omp_get_interop_ptr(o, omp_ipr_device_context, &err);
                auto hDevice =  omp_get_interop_ptr(o, omp_ipr_device, &err);

ye-luo avatar Mar 09 '22 22:03 ye-luo

Full reproducer:

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
  const auto hDevice =  static_cast<ze_device_handle_t>(omp_get_interop_ptr(o, omp_ipr_device, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_device)");
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}

Compile

icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp

Run

./a.out
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE)
Aborted

TApplencourt avatar Mar 09 '22 23:03 TApplencourt

@TApplencourt are you sure omp_get_interop_ptr returns ze_device_handle_t? We have a similarly simple test and it passes: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/Plugin/interop-level-zero.cpp

alexbatashev avatar Mar 30 '22 11:03 alexbatashev

I think so. I found a workaround, this may help you diagnose what is going on internally. Maybe some L0 objects are not initialized when calling only make_device? (Note that I don't use the sycl::platform explicitly when creating the sycl::device)

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
#ifdef _WA
  const ze_driver_handle_t hPlatform = static_cast<ze_driver_handle_t>(omp_get_interop_ptr(o, omp_ipr_platform, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_platform)");
#endif
  const auto hDevice =  static_cast<ze_device_handle_t>(omp_get_interop_ptr(o, omp_ipr_device, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_device)");
  #pragma omp interop destroy(o)
#ifdef _WA
  const sycl::platform sycl_platform = sycl::make_platform<sycl::backend::ext_oneapi_level_zero>(hPlatform);
#endif
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}
$ icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp -D_WA
$ ./a.out
$ icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interopt.cpp
$ ./a.out
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  Native API failed. Native API returns: -30 (CL_INVALID_VALUE) -30 (CL_INVALID_VALUE)
Aborted

TApplencourt avatar Mar 30 '22 16:03 TApplencourt

@TApplencourt are you sure omp_get_interop_ptr returns ze_device_handle_t? We have a similarly simple test and it passes: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/Plugin/interop-level-zero.cpp

Your example is sycl -> L0 -> sycl. There are likely implicit things in SYCL making it pass. We need real interop to work between OpenMP and SYCL.

ye-luo avatar Mar 30 '22 18:03 ye-luo

Correct what we want in a perfect world is:

#include <sycl/sycl.hpp>
#include <omp.h>

int main() {
  omp_interop_t o = 0;
  #pragma omp interop init(targetsync: o)
  int err = -1;
  const auto hPlatform = static_cast<pi_native_object>(omp_get_interop_ptr(o, omp_ipr_platform, &err));
  assert (err >= 0 && "omp_get_interop_ptr(omp_ipr_platform)");
   #pragma omp interop destroy(o)
  sycl::device D = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);
  return 0;
}

TApplencourt avatar Mar 30 '22 18:03 TApplencourt

key requirement is to avoid

  1. any explicit L0 types like ze_device_handle_t. This should be hidden by pi_native_object and enums.
  2. any non standard APIs.

ye-luo avatar Mar 30 '22 19:03 ye-luo

@TApplencourt it would be nice to have such a SYCL & OpenMP interop example for the SYCL presentation! :-)

keryell avatar Apr 01 '22 21:04 keryell

To please @keryell I did more tests, who seem to have discovered a few new bugs. @alexbatashev should I open new tickets for those? I can also prepare a longer write-up if useful.

The code lives here: https://github.com/argonne-lcf/HPC-Patterns/blob/main/sycl_omp_ze_interopt/interop_omp_ze_sycl.cpp And tests OpenMP <-> L0 <-> SYCL Interopt.

The short story is that using the now deprecated sycl::level_zer0::make<sycl::device> and friend the code work.

icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp
./a.out
OMP -> SYCL
   SYCL memcopy using OpenMP pointer
SYCL -> OMP
  OMP memcopy using SYCL pointer
Computation Done

When using the new free function sycl::make_device<sycl::backend::ext_oneapi_level_zero> and friend the code doesn't work.

  1. When changing
const sycl::device sycl_device = sycl::level_zero::make<sycl::device>(sycl_platform, hDevice);

to

const sycl::device sycl_device = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(hDevice);

Trigger a ( icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp -DMAKE_DEVICE)

terminate called after throwing an instance of 'cl::sycl::invalid_parameter_error'
  what():  Queue cannot be constructed with the given context and device as the context does not contain the given device. -33 (CL_INVALID_DEVICE)
Aborted

The new function doesn't take a platform argument but not sure it that matter

  1. Whene Changing
const sycl::context sycl_context = sycl::ext::oneapi::level_zero::make<sycl::context>(sycl_devices, hContext,  sycl::ext::oneapi::level_zero::ownership::keep);

to

sycl::backend_input_t<sycl::backend::ext_oneapi_level_zero, sycl::context> hContextInteropInput = {hContext, sycl_devices};
const sycl::context sycl_context = sycl::make_context<sycl::backend::ext_oneapi_level_zero>(hContextInteropInput);

Make the code segfault (icpx -fiopenmp -fopenmp-targets=spir64 -fsycl interop_omp_ze_sycl.cpp -DMAKE_CONTEXT) . Look like make<sycl::context> doesn't have the KeepOwnership option anymore, maybe it's the problem.

TApplencourt avatar Apr 05 '22 16:04 TApplencourt

@TApplencourt it's fine, let's use this tracker.

+ @smaslov-intel, do you have any idea why these code samples do not work?

alexbatashev avatar Apr 05 '22 16:04 alexbatashev

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

github-actions[bot] avatar Oct 03 '22 02:10 github-actions[bot]

The two problems are still here (sycl::make_device trigger a cl::sycl::invalid_parameter_error, and sycl::make_context trigger a segfault) with Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.20220629)

A newer compiler added a direct OpenMP <-> Sycl interopt (aka Sycl can read direcly the OpenMP object), so this bug is less important for our particular use case. But for portability reason, I think this bug still matter a little bit :)

TApplencourt avatar Oct 03 '22 14:10 TApplencourt

I also get the same error. Here is a test that reproduces the error https://github.com/sogartar/make_sycl_device_from_level_zero_device_test/commit/2ee5e501e172bf7d5a6d02d3dba958ac7cb1beee

sogartar avatar Nov 21 '22 15:11 sogartar

Hi! There have been no updates for at least the last 60 days, though the ticket has assignee(s).

@AlexeySachkov, could I ask you to take one of the following actions? :)

  • Please provide an update if you have any (or just a small comment if you don't have any yet).
  • OR mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it.
  • OR close the issue if it has been resolved.
  • OR take any other suitable action.

Thanks!

KornevNikita avatar May 17 '24 11:05 KornevNikita

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Jul 17 '24 00:07 github-actions[bot]

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Sep 15 '24 00:09 github-actions[bot]

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Nov 15 '24 00:11 github-actions[bot]

Hi! There have been no updates for at least the last 60 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Jan 15 '25 00:01 github-actions[bot]

Hi! There have been no updates for at least the last 90 days, though the issue has assignee(s).

@AlexeySachkov, could you please take one of the following actions:

  • provide an update if you have any
  • unassign yourself if you're not looking / going to look into this issue
  • mark this issue with the 'confirmed' label if you have confirmed the problem/request and our team should work on it
  • close the issue if it has been resolved
  • take any other suitable action.

Thanks!

github-actions[bot] avatar Apr 15 '25 00:04 github-actions[bot]

#pragma omp interop device(id) init(prefer_type("level_zero"), targetsync : interop)
auto hDevice = omp_get_interop_ptr(interop, omp_ipr_device, &err);
const sycl::device sycl_device = sycl::make_device<sycl::backend::ext_oneapi_level_zero>(
    reinterpret_cast<const sycl::backend_input_t<sycl::backend::ext_oneapi_level_zero, sycl::device>>(hDevice));

currently works

ye-luo avatar Apr 15 '25 01:04 ye-luo