
performance increase with the always_inline attribute

Open zjin-lcf opened this issue 2 years ago • 2 comments

https://github.com/intel/llvm/issues/6583

The author of that issue describes a technique for improving the performance of SYCL programs on NVIDIA GPUs. It could be added to the generated code as comments.
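For readers unfamiliar with the attribute, here is a minimal sketch of what forcing inlining looks like. It is plain C++ with no SYCL headers, so it builds with GCC or Clang alone. The names `scale_add` and `saxpy` are hypothetical and do not come from the linked issue or from SYCLomatic output; in a real SYCL program the loop body would be the lambda passed to `parallel_for`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper called from the kernel body. The GCC/Clang attribute
// asks the compiler to inline it at every call site rather than emit a call.
__attribute__((always_inline)) inline float scale_add(float a, float x, float y) {
    return a * x + y;
}

// Stand-in for a kernel: in SYCL, the loop body would live in the lambda
// passed to parallel_for, with i taken from the work-item index.
// Assumes y has at least as many elements as x.
std::vector<float> saxpy(float a, const std::vector<float>& x,
                         const std::vector<float>& y) {
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = scale_add(a, x[i], y[i]);
    }
    return out;
}
```

Whether this helps in a given kernel depends on the compiler and the target, so it is worth measuring both variants.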

zjin-lcf, Sep 16 '22 18:09

@zjin-lcf, I'm not sure we can confirm that `__attribute__((always_inline))` always gives better performance on the SYCL side (inlining can put extra pressure on registers). We will consider implementing this if it can be shown that `__attribute__((always_inline))` consistently improves performance. Thanks.

tomflinda, Sep 28 '22 06:09

We have submitted a PR, https://github.com/oneapi-src/SYCLomatic/pull/745, that introduces a new option to add inline to kernel functions.

tomflinda, Mar 28 '23 03:03