SYCLomatic
performance increase with the always_inline attribute
https://github.com/intel/llvm/issues/6583
The author of the issue above describes a technique for improving the performance of SYCL programs on NVIDIA GPUs. It could be added to the generated code as a comment.
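For readers unfamiliar with the attribute, here is a minimal host-side sketch of how it is spelled in GCC/Clang-compatible compilers. The names `axpy` and `run_axpy` are hypothetical and only stand in for a device helper and a kernel body; this is not SYCLomatic output, and it says nothing about whether the attribute helps in a given kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper standing in for a function called from a kernel.
// __attribute__((always_inline)) is a GCC/Clang extension that asks the
// compiler to inline every call, regardless of its usual inlining heuristics.
__attribute__((always_inline)) inline float axpy(float a, float x, float y) {
    return a * x + y;
}

// Hypothetical stand-in for a kernel body: applies axpy to each element.
inline std::vector<float> run_axpy(float a,
                                   const std::vector<float>& x,
                                   const std::vector<float>& y) {
    assert(x.size() == y.size());  // inputs must have matching lengths
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = axpy(a, x[i], y[i]);  // call site that is forced inline
    }
    return out;
}
```

In a SYCL program the same attribute would presumably go on the functions called from the kernel lambda; whether that is faster has to be measured per kernel.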
@zjin-lcf, we are not sure we can confirm that "__attribute__((always_inline))" always improves performance on the SYCL side, since inlining can put extra pressure on registers. We will consider implementing this if it can be shown that "__attribute__((always_inline))" consistently improves performance. Thanks.
We have implemented a PR, https://github.com/oneapi-src/SYCLomatic/pull/745, that introduces a new option to add inline to kernel functions.