llvm [SYCL] Do not attach reqd_work_group_size info when multiple are detected

Open jzc opened this issue 10 months ago • 2 comments

Apr 22 '24 20:04 jzc

Are we allowed to discard a requirement just because multiple requirements contradict each other? From https://intel.github.io/llvm-docs/design/OptionalDeviceFeatures.html:

For a kernel that is decorated with the [[sycl::reqd_work_group_size(W)]] or [[sycl::reqd_sub_group_size(S)]] attribute, the exception must be thrown if the device does not support the work group size W or the sub-group size S.

We can't honor this requirement if we discard reqd_work_group_size.

May 07 '24 21:05 LU-JOHN

Are we allowed to discard a requirement just because multiple requirements contradict each other? From https://intel.github.io/llvm-docs/design/OptionalDeviceFeatures.html:

For a kernel that is decorated with the [[sycl::reqd_work_group_size(W)]] or [[sycl::reqd_sub_group_size(S)]] attribute, the exception must be thrown if the device does not support the work group size W or the sub-group size S.

We can't honor this requirement if we discard reqd_work_group_size.

We discard that metadata, and therefore lose the ability to emit that error, only when a user explicitly specifies the non-standard -fsycl-device-code-split=off option. We have not claimed to be fully conformant with the SYCL specification with that flag.

Essentially, this is a trade-off between user experience and conformance. The user-experience problem was that we also check that the local size passed to parallel_for matches the reqd_work_group_size attribute attached to the kernel. Since we record the attribute on a per-device-image basis, assuming that it is the same for all kernels, this caused false alarms that fully prevented users from launching any kernels. The disabled device code split path is essentially the default for FPGA devices, and therefore we decided to go this way.

May 14 '24 08:05 AlexeySachkov
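
The false alarm described in the last comment can be illustrated with a small stand-alone model. This is only a sketch in plain C++17, not DPC++ runtime code: the names Kernel, imagePropertyFirstFound, imagePropertyIfUnique and launchPassesCheck are hypothetical, and the "first value found wins" rule for the old behavior is an assumption made for illustration, since the thread only says the attribute was recorded per device image as if it were the same for all kernels.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model types; none of these names come from the DPC++ sources.
using WGSize = std::array<std::size_t, 3>;

struct Kernel {
    std::string name;
    // Value of [[sycl::reqd_work_group_size(...)]], if the kernel has one.
    std::optional<WGSize> reqdWGSize;
};

// ASSUMPTION: models the old behavior by letting the first attribute found
// stand for the whole device image.
std::optional<WGSize> imagePropertyFirstFound(const std::vector<Kernel>& kernels) {
    for (const Kernel& k : kernels)
        if (k.reqdWGSize) return k.reqdWGSize;
    return std::nullopt;
}

// Models the change in the title: attach the property only when all
// decorated kernels agree; attach nothing when multiple values are detected.
std::optional<WGSize> imagePropertyIfUnique(const std::vector<Kernel>& kernels) {
    std::optional<WGSize> seen;
    for (const Kernel& k : kernels) {
        if (!k.reqdWGSize) continue;
        if (seen && *seen != *k.reqdWGSize) return std::nullopt;  // conflicting values
        seen = k.reqdWGSize;
    }
    return seen;
}

// Models the launch-time check: the local size passed to parallel_for must
// match the recorded per-image property, if one is recorded.
bool launchPassesCheck(const std::optional<WGSize>& imageProperty,
                       const WGSize& localSize) {
    return !imageProperty || *imageProperty == localSize;
}

// Usage example: two kernels with different requirements in one image,
// as happens when device code split is disabled.
void selfCheck() {
    const std::vector<Kernel> image = {
        {"KernelA", WGSize{8, 1, 1}},
        {"KernelB", WGSize{16, 1, 1}},
    };
    // Old model: KernelB is rejected even though it uses its own required size.
    assert(!launchPassesCheck(imagePropertyFirstFound(image), WGSize{16, 1, 1}));
    // New model: no property is attached, so both kernels can be launched.
    assert(launchPassesCheck(imagePropertyIfUnique(image), WGSize{16, 1, 1}));
    assert(launchPassesCheck(imagePropertyIfUnique(image), WGSize{8, 1, 1}));
}
```

In this model, dropping the property removes the false alarm, but it also removes the information needed to reject an unsupported work-group size, which is the conformance cost raised in the first comment.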
