llvm icon indicating copy to clipboard operation
llvm copied to clipboard

[SYCL] Device code generation for "free functions", a new way to define kernels

Open rdeodhar opened this issue 10 months ago • 14 comments

This PR is intended for review of the device-side code generation of "free functions".

The free function spec is here: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc

This PR does not implement the full specification. It does however allow free function markup and execution using a manual method.

Later PRs will add the remaining parts of the spec. This is the current status:

  • Free functions are supported at file scope only.
  • The SYCL_EXTERNAL markup is needed for free functions in addition to the SYCL_EXT_ONEAPI_FUNCTION_PROPERTY property.
  • The traits described in the specification section "New traits for kernel functions" are not yet implemented.
  • The functions described in the specification section "New kernel bundle member functions" are not yet implemented. As a result, a free-function kernel can only be launched by finding the kernel_id that matches the function's name, and this requires knowing how the compiler mangles the function's name when creating the kernel. This is a temporary solution until the "New kernel bundle member functions" are implemented.
  • The compiler does not yet diagnose an error if the application violates any of the restrictions listed in the specification under the section "Defining a free function kernel".
  • Device code generation is supported for scalars and USM pointers only. It is not supported for complex kernel argument types requiring decomposition like accessor, local_accessor, or stream.
  • The implementation has not been tested to handle the case when a kernel argument is optimized away. The switch -fno-sycl-dead-args-optimization could be used to disable this optimization, if needed
  • The kernel information descriptor info::kernel::num_args cannot yet be used to query the number of arguments in a free function kernel.

rdeodhar avatar Mar 29 '24 15:03 rdeodhar

:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.

github-actions[bot] avatar Mar 29 '24 15:03 github-actions[bot]

Regarding adding frontend tests, those use "fake" headers instead of the real SYCL headers. The recognition of free function kernels is based on function properties. Is there a way to fake the property setting on functions, too? Otherwise how to trigger free function code generation?

rdeodhar avatar Apr 01 '24 17:04 rdeodhar

The SemaSYCL formatting-only changes were not made intentionally. However, the formatting in my version is actually more "correct". Anyway, I'll put it back and see if it is accepted.

rdeodhar avatar Apr 01 '24 20:04 rdeodhar

Updated the PR description with current status of the implementation.

rdeodhar avatar Apr 01 '24 21:04 rdeodhar

Tests check-direct-attribute-propagation.cpp, check-notdirect-attribute-propagation.cpp and sycl-esimd-ast.cpp needed changes because they list function attributes in the sequence they appear in the AST. The order in which function attributes are added to a FunctionDecl has changed because MarkDevices is now called before integration header emission. MarkDevices adds several function attributes, including the single_task and nd_range properties which indicate that a function is a free function. Integration header emission adds the attribute AsmLabelAttr. MarkDevices had to be repositioned because the sequence of actions for handling free functions is: 1) attach function attributes to function decls, 2) recognize free functions from their attributes and process them, and 3) emit the integration header. This sequence of actions is visible in file Sema.cpp. The call to function "MarkDevices" must be made before integration header emission. Because the relative ordering of the call to function MarkDevices and integration header emission has been reversed, the AsmLabelAttr attribute gets repositioned.

rdeodhar avatar Apr 02 '24 03:04 rdeodhar

I have no more comments about the code, but I noticed more more bullet point to add to the TODO list:

  • The compiler front-end does not currently recognize a free-function kernel via the properties nd_range_kernel or single_task_kernel. As a result, free-function kernels must also be decorated with SYCL_EXTERNAL. This is a temporary limitation, and the need for adding SYCL_EXTERNAL will be removed later.

gmlueck avatar Apr 02 '24 11:04 gmlueck

The SemaSYCL formatting-only changes were not made intentionally. However, the formatting in my version is actually more "correct". Anyway, I'll put it back and see if it is accepted.

Thanks. I would prefer formatting changes to be in a separate PR if we want to change it. This PR is big enough as-is and it makes sense keeping just the required functional changes in this.

elizabethandrews avatar Apr 02 '24 21:04 elizabethandrews

Regarding adding frontend tests, those use "fake" headers instead of the real SYCL headers. The recognition of free function kernels is based on function properties. Is there a way to fake the property setting on functions, too? Otherwise how to trigger free function code generation?

Isn't it triggered by an attribute? Can't you just apply the attribute to the function?

elizabethandrews avatar Apr 02 '24 21:04 elizabethandrews

Regarding adding frontend tests, those use "fake" headers instead of the real SYCL headers. The recognition of free function kernels is based on function properties. Is there a way to fake the property setting on functions, too? Otherwise how to trigger free function code generation?

Isn't it triggered by an attribute? Can't you just apply the attribute to the function?

The attribute is normally expressed as [[sycl_detail::add_ir_attributes_function("sycl-single-task-kernel",1)]]. Now I expect the sycl_detail:: won't be usable, but if I use just [[add_ir_attributes_function("sycl-single-task-kernel",1)]] I get a message that that attribute is not recognized.

Is there another way to add that attribute?

rdeodhar avatar Apr 02 '24 22:04 rdeodhar

About frontend tests. I looked at existing tests in CodeGenSYCL and I think I know how to add them. Will do so.

Update: a frontend test has been added.

rdeodhar avatar Apr 03 '24 16:04 rdeodhar

About frontend tests. I looked at existing tests in CodeGenSYCL and I think I know how to add them. Will do so.

Update: a frontend test has been added.

Please add AST tests as well.

elizabethandrews avatar Apr 04 '24 15:04 elizabethandrews

About frontend tests. I looked at existing tests in CodeGenSYCL and I think I know how to add them. Will do so. Update: a frontend test has been added.

Please add AST tests as well.

OK, I'll convert the CodeGen test to an AST test. The code gen should be covered by the end-to-end test.

rdeodhar avatar Apr 04 '24 16:04 rdeodhar

About frontend tests. I looked at existing tests in CodeGenSYCL and I think I know how to add them. Will do so. Update: a frontend test has been added.

Please add AST tests as well.

OK, I'll convert the CodeGen test to an AST test. The code gen should be covered by the end-to-end test.

No please don't do that. We require both AST and CodeGen tests for the frontend. E2E tests are not as easy to debug for frontend issues.

elizabethandrews avatar Apr 04 '24 17:04 elizabethandrews

About frontend tests. I looked at existing tests in CodeGenSYCL and I think I know how to add them. Will do so. Update: a frontend test has been added.

Please add AST tests as well.

OK, I'll convert the CodeGen test to an AST test. The code gen should be covered by the end-to-end test.

No please don't do that. We require both AST and CodeGen tests for the frontend. E2E tests are not as easy to debug for frontend issues.

OK, I'll put back the CodeGen test.

rdeodhar avatar Apr 04 '24 18:04 rdeodhar

@intel/llvm-gatekeepers, at least one member of the teams that own files in this PR have given their approval. Could you merge it in, please?

rdeodhar avatar May 16 '24 16:05 rdeodhar