llvm icon indicating copy to clipboard operation
llvm copied to clipboard

[SYCL] Make invoke_simd convert its arguments to appropriate type.

Open kbobrovs opened this issue 3 years ago • 4 comments

  • Introduce an intermediate lambda in invoke_simd and call SIMD target from the lambda with given (SIMD) arguments coming from the lambda's formal parameters. This way compiler automatically performs necessary argument type conversion.

  • Introduce a new implicit constructor for simd to allow conversion of simd objects with _VecExt storage kind (used in invoke_simd extension).

  • Fix linkonce_odr linkage of entry points : change it to external linkage to avoid removal by the inliner and llvm-linker.

This new scheme complicates the SIMD target function pointer flow, so LowerInvokeSimd.cpp update is needed to accommodate the new flow.

E2E test is underway.

Signed-off-by: Konstantin S Bobrovsky [email protected]

kbobrovs avatar Aug 09 '22 06:08 kbobrovs

Folks, I would appreciate if could you review the following parts:

@rolandschulz: sycl/include/std/experimental/simd.hpp sycl/include/sycl/ext/oneapi/experimental/invoke_simd.hpp sycl/test/invoke_simd/invoke_simd.cpp

@v-klochkov: llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp

@asudarsa: llvm/tools/sycl-post-link/ModuleSplitter.cpp llvm/tools/sycl-post-link/ModuleSplitter.h

kbobrovs avatar Aug 09 '22 06:08 kbobrovs

The llvm/test/SYCLLowerIR/ESIMD/lower_invoke_simd.ll failure will be fixed by test update (hopefully).

kbobrovs avatar Aug 09 '22 07:08 kbobrovs

They will only remove them if unused. Why do they incorrectly assume they are unused? Is it because you spit into SPMD and SIMD and the entry point into SIMD is unused inside the SIMD part? And therefore it gets removed after splitting?

No, it is not because of splitting. They (linker and inliner) correctly assume they are unused, because there can remain no calls to entry points within the module. In this case inliner turns this

invoke_simd(Helper,...)
Helper(...) {
  call UserF
}
UserF(...) { code; } // <-- declared as entry point in the user code (SYCL_EXTERNAL)

into

invoke_simd(Helper,...)
Helper(...) {
  code;
}

which is correct from the LLVMIR optimization standpoint, but not correct conceptually because UserF is declared as entry point and someone might want to link to it / call it later. Ideally, SYCL_EXTERNAL must not lead to linkonce* linkage, even for template instantiations (or linkonce* must be interpreted differently by inliner/linker, but this might be against its definition).

More practical example would be a template SYCL_EXTERNAL function.

kbobrovs avatar Aug 10 '22 23:08 kbobrovs

I'm pretty sure the intention of SYCL_EXTERNAL (as defined by SYCL spec and therefore not in the context of invoke_simd) was never to give a function external linkage if it didn't have it according to the C++ spec. Only to guarantee that a function which has external linkage also has external linkage for device linking. An implicit instantiated template function doesn't have external linkage in C++. You need to explicit instantiate it. And if you do that the LLVM-IR linkage is weak not linkonce. Which guarantees it works.

Ideally, SYCL_EXTERNAL must not lead to linkonce* linkage, even for template instantiations Therefore I disagree with that.

For the case of invoke_simd why is it possible that the inliner gets rid of the call. My understanding is that we don't support inlining through invoke_simd (yet).

rolandschulz avatar Aug 11 '22 01:08 rolandschulz

'm pretty sure the intention of SYCL_EXTERNAL (as defined by SYCL spec and therefore not in the context of invoke_simd) was never to give a function external linkage if it didn't have it according to the C++ spec. Only to guarantee that a function which has external linkage also has external linkage for device linking.

Hm. This excerpt from the spec + details in (5.10.1.) pretty much means SYCL_EXTERNAL = external linkage for me:

Any variable or function that is odr-used from a device function must be defined in the same translation unit as that use. However, a function may be defined in another translation unit if the implementation defines the SYCL_EXTERNAL macro as described in Section 5.10.1.

But WeakODR might be better in this case of implicit instantiation indeed. I changed the code.

kbobrovs avatar Aug 11 '22 05:08 kbobrovs

We discussed the linkonce_odr/SYCL_EXTERNAL problem with @rolandschulz and others in more details. Key takeaways:

  • SYCL_EXTERNAL does not affect function linkage as defined by C++ rules
  • linkonce_odr can be SYCL_EXTERNAL. But if no uses after optimizations - compiler is free to remove it.
  • To avoid removal, weak_odr can replace linkonce_odr w/o affecting correctness.

In this patch I do the linkage replacement for both helper and target. For helper this is permanent change, for target - temporary. Just to protect the target from removal when ModuleDesc::cleanup() is called on split ESIMD module, because there can be other uses of the target in the SYCL part of the program. Once ESIMD and SYCL are linked back together (this is what invoke_simd requires), cleanup is run on the linked module.

Original Pseudo-IR illustrating helper and target:

...
__builtin_invoke_simd(helper, target,...);
...

ret helper(callable f, args...) {
  f(args...);
}

Optimized to:

...
__builtin_invoke_simd(helper, ...);
...

ret helper(args...) {
  target(args...);
}

but there can be still other uses of target. Otherwise it gets removed.

kbobrovs avatar Aug 14 '22 00:08 kbobrovs

The only failure is InvokeSimd/invoke_simd_smoke.cpp, which is fixed by https://github.com/intel/llvm-test-suite/pull/1146

kbobrovs avatar Aug 14 '22 18:08 kbobrovs

/verify with https://github.com/intel/llvm-test-suite/pull/1146

kbobrovs avatar Aug 14 '22 18:08 kbobrovs

@asudarsa, @cperkinsintel - I resolved outstanding comments with @rolandschulz and @v-klochkov. Please take a look.

kbobrovs avatar Aug 15 '22 19:08 kbobrovs

Only nitpicks. LGTM.

@asudarsa, thanks - let me address your NIT comments in a separate PR to reduce the risk of conflicts for this big PR.

kbobrovs avatar Aug 15 '22 20:08 kbobrovs

Had to fix test failure (see 9dd341e), so I addressed @asudarsa's review comments as well (b4b6749). @rolandschulz, @v-klochkov, @asudarsa - please re-approve.

kbobrovs avatar Aug 16 '22 00:08 kbobrovs

/verify with https://github.com/intel/llvm-test-suite/pull/1146

kbobrovs avatar Aug 16 '22 00:08 kbobrovs

  • Expected failure:

    • SYCL / Linux / L0 GEN9 LLVM Test Suite (pull_request_target):
    • SYCL / Linux / OCL GEN9 LLVM Test Suite (pull_request_target):
      • InvokeSimd/invoke_simd_smoke.cpp, which is fixed by https://github.com/intel/llvm-test-suite/pull/1146
  • Unrelated failures:

    • SYCL / Linux / HIP AMDGPU LLVM Test Suite (pull_request_target) :
    • SYCL / Linux / CUDA LLVM Test Suite (pull_request_target):
      • SYCL :: DeviceLib/imf_simd_emulate_test.cpp

kbobrovs avatar Aug 16 '22 02:08 kbobrovs

@cperkinsintel, please review/approve for @intel/llvm-reviewers-runtime

kbobrovs avatar Aug 16 '22 07:08 kbobrovs

@cperkinsintel, please review/approve for @intel/llvm-reviewers-runtime

Actually, I have approval from @intel/llvm-reviewers-runtime - from @v-klochkov. Merging.

kbobrovs avatar Aug 16 '22 17:08 kbobrovs