
sycl: unify unary kernels with a generic implementation and enable wide operator support

shani-f opened this pull request 1 month ago • 4 comments

Summary

Adds a generic unary implementation for the SYCL backend, allowing many unary operators to share a single optimized execution path.
The implementation matches the behavior of the existing CPU unary kernels.

Changes

  • Added ggml_sycl_op_unary generic function
  • Updated unary dispatch in element_wise.cpp
  • Removed per-op SYCL kernels (ABS, SGN, NEG, STEP, etc.)
  • Updated documentation in:
    • docs/ops.md
    • docs/ops/SYCL.csv

Implementation

  • One templated kernel handles all unary ops
  • Supports 4-D tensors and non-contiguous views
  • Supports F16 and F32 data types
  • Uses dispatch_ggml_sycl_op_unary with parallel_for
  • Eliminates duplicated indexing logic across operators
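
The idea above can be sketched in plain C++ (no SYCL toolchain needed): one templated, stride-aware loop replaces per-op indexing logic, and each unary op is a small functor. All names here (unary_op_apply, op_abs, etc.) are illustrative assumptions, not the actual ggml_sycl_op_unary symbols; in the real backend the loop body runs inside a sycl::parallel_for, and F16 support would add a second data-type specialization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Illustrative sketch only: a few unary ops as functors. In the backend,
// each GGML op maps to one such functor.
struct op_abs  { float operator()(float x) const { return std::fabs(x); } };
struct op_neg  { float operator()(float x) const { return -x; } };
struct op_relu { float operator()(float x) const { return x > 0.0f ? x : 0.0f; } };

// One stride-aware templated loop stands in for the shared kernel.
// ne[] are element counts per dimension and nb[] are byte strides, so
// contiguous and non-contiguous 4-D views go through the same code path.
template <typename Op>
void unary_op_apply(const void *src, void *dst,
                    const int64_t ne[4],
                    const size_t nb_src[4], const size_t nb_dst[4],
                    Op op) {
    // In the SYCL backend this quadruple loop would be flattened into a
    // single parallel_for over ne[0]*ne[1]*ne[2]*ne[3] work items.
    for (int64_t i3 = 0; i3 < ne[3]; ++i3)
    for (int64_t i2 = 0; i2 < ne[2]; ++i2)
    for (int64_t i1 = 0; i1 < ne[1]; ++i1)
    for (int64_t i0 = 0; i0 < ne[0]; ++i0) {
        const float *s = (const float *)((const char *)src +
            i0 * nb_src[0] + i1 * nb_src[1] + i2 * nb_src[2] + i3 * nb_src[3]);
        float *d = (float *)((char *)dst +
            i0 * nb_dst[0] + i1 * nb_dst[1] + i2 * nb_dst[2] + i3 * nb_dst[3]);
        *d = op(*s);
    }
}
```

Because the op is a template parameter, the compiler emits one inlined specialization per functor, so ABS, NEG, RELU, and the rest reuse a single indexing implementation with no runtime dispatch overhead.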

Supported Ops

  • ABS
  • SGN
  • NEG
  • STEP
  • RELU
  • HARDSIGMOID
  • TANH
  • GELU
  • SILU
  • SIGMOID
  • HARDSWISH
  • GELU_QUICK
  • GELU_ERF
  • EXP
  • ELU

Testing

  • All supported unary ops pass test-backend-ops
  • Verified correctness on contiguous + non-contiguous tensors
  • Matches CPU results

Performance

  • Single optimized unary path for all ops
  • Reduced kernel count and maintenance complexity
  • Same SYCL scheduling style as existing ops

Compatibility

  • Works on OpenCL and Level Zero devices
  • No changes required for CPU fallback
  • Follows SYCL backend design conventions

shani-f avatar Nov 12 '25 15:11 shani-f

Hello @CISC @NeoZhangJianyu, The ggml-ci-x64-cpu-low-perf test failed, but it’s CPU-only and unrelated to my SYCL changes. What’s the next step toward merging? Thanks!

shani-f avatar Nov 12 '25 19:11 shani-f

We could merge directly after the conflicts are fixed.

NeoZhangJianyu avatar Nov 13 '25 00:11 NeoZhangJianyu

@shani-f Could you fix the conflicts?

Thank you!

NeoZhangJianyu avatar Nov 14 '25 01:11 NeoZhangJianyu

Hello @NeoZhangJianyu, Yesterday I resolved all conflicts and the branch was fully up to date. The new conflicts came from the latest merge into master, and I’ve resolved those as well. Thanks!

shani-f avatar Nov 14 '25 12:11 shani-f

Regen the CSV, at least TOPK_MOE should no longer be there.

CISC avatar Nov 15 '25 18:11 CISC

Is everything correct and finalized now?

shani-f avatar Nov 15 '25 18:11 shani-f

I really did mean that you should regen it; there are more ops in the CSV that would be removed if you did.

CISC avatar Nov 15 '25 19:11 CISC

Hello @CISC, I tried to regenerate the CSV, but I couldn’t find any mechanism in the current llama.cpp/ggml version that actually rebuilds SYCL.csv automatically. If you want, I can reset both ops.md and SYCL.csv to their upstream versions and leave them as-is.

Let me know what you prefer. Thanks!

shani-f avatar Nov 15 '25 21:11 shani-f

As explained here, simply run test-backend-ops support --output csv from your build/bin folder. https://github.com/ggml-org/llama.cpp/blob/9ee5bdc72a2124b07ad786539660627cdc910ba9/docs/ops.md?plain=1#L7

CISC avatar Nov 15 '25 21:11 CISC

Hello @CISC, I really hope it’s okay now. Thank you so much for your help!

shani-f avatar Nov 15 '25 23:11 shani-f

Yes, perfect, thank you! :)

CISC avatar Nov 15 '25 23:11 CISC

@shani-f I find that the status of 4 ops changed from supported to unsupported. But I tested them with the latest version, commit 2376b7758c58b0ede05de382bf72bb538f11ef9a (HEAD -> master, tag: b7083), and all are supported (some not fully supported).

Could you confirm it?

./build/bin/test-backend-ops -o GROUP_NORM_MUL_ADD
./build/bin/test-backend-ops -o NORM_MUL_ADD
./build/bin/test-backend-ops -o RMS_NORM_MUL_ADD
./build/bin/test-backend-ops -o SOFTCAP

I think the method "Run test-backend-ops support --output csv with your backend name and redirect output to a csv file in docs/ops/ (e.g., docs/ops/CUDA.csv)" has some issues.

We can't use the above method to create a CSV file and directly overwrite the existing CSV file; doing so leads to this issue.

I recommend testing and merging the CSV manually.

Thank you!

NeoZhangJianyu avatar Nov 17 '25 01:11 NeoZhangJianyu

@NeoZhangJianyu No, this is correct; those "ops" are fusions and are now filtered from the list.

Edit: If at some point test-backend-ops becomes capable of detecting that fusion happens, they can be added back in.

CISC avatar Nov 17 '25 08:11 CISC

Why do other backends support them in the MD if those ops are fusions?

Running ./build/bin/test-backend-ops -o GROUP_NORM_MUL_ADD, we see that the GROUP_NORM_MUL_ADD cases pass. But in ops.md, the GROUP_NORM_MUL_ADD status is 'unsupported'.

If a user wants to run a new LLM that includes GROUP_NORM_MUL_ADD with llama.cpp, the SYCL backend may be passed over because ops.md shows the op as unsupported. It gives people the wrong information.

NeoZhangJianyu avatar Nov 18 '25 00:11 NeoZhangJianyu

It's outdated information; it will disappear for all other backends as well.

Every backend that has all the basic ops in the fusion passed these tests before; now they are filtered.

CISC avatar Nov 18 '25 08:11 CISC