sycl: unify unary kernels with a generic implementation and enable wide operator support
## Summary
Adds a generic unary implementation for the SYCL backend, allowing many unary operators to share a single optimized execution path.
The implementation matches the behavior of the existing CPU unary kernels.
## Changes

- Added a generic `ggml_sycl_op_unary` function
- Updated unary dispatch in `element_wise.cpp`
- Removed per-op SYCL kernels (ABS, SGN, NEG, STEP, etc.)
- Updated documentation in `docs/ops.md` and `docs/ops/SYCL.csv`
## Implementation

- One templated kernel handles all unary ops
- Supports 4-D tensors and non-contiguous views
- Supports `F16` and `F32` data types
- Uses `dispatch_ggml_sycl_op_unary` with `parallel_for`
- Eliminates duplicated indexing logic across operators
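The shared indexing logic for non-contiguous 4-D views can be sketched as plain loops (a simplified, hypothetical illustration, not the PR's actual SYCL kernel — the real code flattens this into a `parallel_for` over elements and also handles `F16`):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: walk a 4-D view via per-dimension strides
// (given here in elements), apply `op` element-wise, and write a
// contiguous result. One templated driver serves every unary op.
template <typename F>
void unary_apply_4d(const float *src, float *dst,
                    const int64_t ne[4],   // extents ne0..ne3
                    const int64_t nb[4],   // strides in elements
                    F op) {
    int64_t i = 0;
    for (int64_t i3 = 0; i3 < ne[3]; ++i3)
    for (int64_t i2 = 0; i2 < ne[2]; ++i2)
    for (int64_t i1 = 0; i1 < ne[1]; ++i1)
    for (int64_t i0 = 0; i0 < ne[0]; ++i0)
        dst[i++] = op(src[i0 * nb[0] + i1 * nb[1] +
                          i2 * nb[2] + i3 * nb[3]]);
}
```

Because the view is addressed purely through strides, the same driver covers both contiguous tensors and non-contiguous views, which is what removes the duplicated per-op indexing code.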
## Supported Ops
- ABS
- SGN
- NEG
- STEP
- RELU
- HARDSIGMOID
- TANH
- GELU
- SILU
- SIGMOID
- HARDSWISH
- GELU_QUICK
- GELU_ERF
- EXP
- ELU
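Each op in the list above reduces to a small element-wise functor plugged into the one shared code path. A hypothetical illustration (the functor names and the driver below are mine, mirroring the ggml CPU reference semantics, not the PR's code):

```cpp
#include <cassert>
#include <cmath>

// Illustrative per-op functors in the style a generic unary kernel
// could dispatch on.
struct op_neg  { float operator()(float x) const { return -x; } };
struct op_sgn  { float operator()(float x) const { return float((x > 0.f) - (x < 0.f)); } };
struct op_relu { float operator()(float x) const { return x > 0.f ? x : 0.f; } };
struct op_silu { float operator()(float x) const { return x / (1.f + std::exp(-x)); } };

// A single generic element-wise driver: the one code path shared by
// all unary ops, instantiated per functor at compile time.
template <typename Op>
void unary_apply(const float *src, float *dst, int n, Op op) {
    for (int i = 0; i < n; ++i) dst[i] = op(src[i]);
}
```

Adding a new unary op then means writing one functor; the dispatch, indexing, and launch logic are not duplicated.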
## Testing

- All supported unary ops pass `test-backend-ops`
- Verified correctness on contiguous and non-contiguous tensors
- Matches CPU results
## Performance
- Single optimized unary path for all ops
- Reduced kernel count and maintenance complexity
- Same SYCL scheduling style as existing ops
## Compatibility
- Works on OpenCL and Level Zero devices
- No changes required for CPU fallback
- Follows SYCL backend design conventions
Hello @CISC @NeoZhangJianyu, The ggml-ci-x64-cpu-low-perf test failed, but it’s CPU-only and unrelated to my SYCL changes. What’s the next step toward merging? Thanks!
We could merge directly after the conflicts are fixed.
@shani-f Could you fix the conflicts?
Thank you!
Hello @NeoZhangJianyu, Yesterday I resolved all conflicts and the branch was fully up to date. The new conflicts came from the latest merge into master, and I’ve resolved those as well. Thanks!
Regen the CSV, at least TOPK_MOE should no longer be there.
Is everything correct and finalized now?
I really did mean that you should regen it, there are more ops in the CSV that would be removed if you did.
Hello @CISC, I tried to regenerate the CSV, but I couldn’t find any mechanism in the current LLAMA/ggml version that actually rebuilds SYCL.csv automatically. If you want, I can reset both ops.md and SYCL.csv to their upstream versions and leave them as-is.
Let me know what you prefer. Thanks!
> I tried to regenerate the CSV, but I couldn’t find any mechanism in the current LLAMA/ggml version that actually rebuilds SYCL.csv automatically.
As explained here, simply run `test-backend-ops support --output csv` from your `build/bin` folder.
https://github.com/ggml-org/llama.cpp/blob/9ee5bdc72a2124b07ad786539660627cdc910ba9/docs/ops.md?plain=1#L7
Hello @CISC, I really hope it’s okay now. Thank you so much for your help!
Yes, perfect, thank you! :)
@shani-f I find that the status of 4 ops changed from supported to unsupported. But I tested them with the latest version, commit 2376b7758c58b0ede05de382bf72bb538f11ef9a (HEAD -> master, tag: b7083), and all are supported (some are not fully supported).
Could you confirm it?
```shell
./build/bin/test-backend-ops -o GROUP_NORM_MUL_ADD
./build/bin/test-backend-ops -o NORM_MUL_ADD
./build/bin/test-backend-ops -o RMS_NORM_MUL_ADD
./build/bin/test-backend-ops -o SOFTCAP
```
I think the method "Run `test-backend-ops support --output csv` with your backend name and redirect the output to a CSV file in docs/ops/ (e.g., docs/ops/CUDA.csv)" has some issues.
We can't use that method to create a CSV file and overwrite the existing CSV file directly; it leads to this issue.
I recommend testing and merging the CSV manually.
Thank you!
@NeoZhangJianyu No, this is correct; those "ops" are fusions and are now filtered from the list.
Edit: If at some point test-backend-ops becomes capable of detecting that fusion happens they can be added back in.
Why do other backends support them in the MD if those ops are fusions?
Run `./build/bin/test-backend-ops -o GROUP_NORM_MUL_ADD` and you will see that the GROUP_NORM_MUL_ADD cases pass.
But in ops.md, the GROUP_NORM_MUL_ADD status is 'unsupported'.
If a user wants to run a new LLM that includes GROUP_NORM_MUL_ADD with llama.cpp, the SYCL backend will be skipped because ops.md shows it as unsupported. That gives people the wrong information.
> Why do other backends support them in the MD if those ops are fusions?
It's outdated information; it will disappear for all the other backends as well.
Every backend that has all the basic ops in the fusion passed these tests before; now they are filtered.