
opencl: Add support for multiple devices

Open · linehill opened this issue 8 months ago · 1 comment

... but limited to one platform for now. A platform with a GPU will be preferred.

Additionally:

  • Filter out devices that lack capabilities needed by the backend implementation (half-precision support, OpenCL 2.0+, etc.).

  • Make ggml_backend_opencl_reg() thread-safe.

linehill avatar Mar 28 '25 08:03 linehill

@lhez please take a look. It makes sense to add multi-device support.

@linehill please rebase once we merge #12886 when you get the chance

max-krasnyansky avatar Apr 11 '25 04:04 max-krasnyansky

Gentle ping, @max-krasnyansky, @lhez. Is this PR good for landing?

linehill avatar May 05 '25 13:05 linehill

Thank you @linehill, it looks good.

lhez avatar May 06 '25 06:05 lhez

@max-krasnyansky ping - I think this PR should be good to merge.

lhez avatar May 14 '25 17:05 lhez

Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

I would like to compare OpenCL and Vulkan performance.

acbits avatar May 22 '25 00:05 acbits

Does this PR bring back support for AMD/Nvidia GPUs or is it still missing?

They aren't supported, at least because of the device whitelist in here. The backend might work with AMD and Nvidia OpenCL drivers if you remove this line, but beware: the current kernel implementations seem to be tailored for Intel and Qualcomm HW.

linehill avatar May 22 '25 10:05 linehill

The problem is that some of the kernels use subgroups and need to know the subgroup size, and Nvidia's OpenCL implementation does not support subgroups. I think AMD has subgroup support in OpenCL, so it should be relatively easy to enable AMD.

lhez avatar May 22 '25 18:05 lhez