Add SYCL examples
Add an example implementing the muladd op with SyclExtension.
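For context, here is a minimal sketch (not necessarily the exact code in this PR) of what such a SYCL muladd kernel can look like. It assumes float inputs and that the SYCL queue of the current XPU stream is obtained via c10::xpu::getCurrentXPUStream().queue():

```cpp
// Sketch of a SYCL implementation of muladd (out = a * b + c) for XPU tensors.
#include <sycl/sycl.hpp>
#include <ATen/ATen.h>
#include <c10/xpu/XPUStream.h>

at::Tensor mymuladd_xpu(const at::Tensor& a, const at::Tensor& b, double c) {
  TORCH_CHECK(a.device().is_xpu() && b.device().is_xpu(), "inputs must be XPU tensors");
  at::Tensor a_c = a.contiguous();
  at::Tensor b_c = b.contiguous();
  at::Tensor out = at::empty_like(a_c);

  const float* a_ptr = a_c.data_ptr<float>();
  const float* b_ptr = b_c.data_ptr<float>();
  float* out_ptr = out.data_ptr<float>();
  const int64_t numel = a_c.numel();

  // Submit a plain SYCL kernel to the queue PyTorch uses for the current XPU stream.
  sycl::queue& q = c10::xpu::getCurrentXPUStream().queue();
  q.parallel_for(sycl::range<1>(numel), [=](sycl::id<1> i) {
    out_ptr[i] = a_ptr[i] * b_ptr[i] + static_cast<float>(c);
  });
  return out;
}
```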
Hi @goldsborough @zou3519, could you help review this PR? Thank you!
What is SYCL?
> What is SYCL?
SYCL is a C++ abstraction for heterogeneous programming that allows writing and executing code (kernels) on GPU devices. It is the way Intel writes kernels for Intel GPUs, and a number of PyTorch ATen operators are written in SYCL. Writing SYCL kernels is exposed to PyTorch users through the C++ extension API. We are trying to update the PyTorch tutorials/examples to demonstrate this and encourage users to use SYCL to write custom kernels. See https://www.khronos.org/sycl/. Should we add a note on what SYCL is in the project README?
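For illustration, a kernel in plain SYCL (independent of PyTorch) is just a C++ lambda submitted to a queue; this is a minimal standalone sketch, not code from this PR:

```cpp
// Plain SYCL: allocate unified shared memory and run a kernel on the default device.
#include <sycl/sycl.hpp>

int main() {
  sycl::queue q;                                   // picks a default device, e.g. an Intel GPU
  constexpr size_t n = 1024;
  float* data = sycl::malloc_shared<float>(n, q);  // USM allocation visible to host and device

  q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
    data[i] = static_cast<float>(i) * 2.0f;        // runs on the device
  }).wait();

  sycl::free(data, q);
  return 0;
}
```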
@zou3519, XPU is now one of the official backends included in PyTorch. It is backed by SYCL for generating device-specific code and by the PyTorch API that lets users create custom operators. Further, the PyTorch tutorial covers the use of this API for C++ and CUDA (https://github.com/pytorch/tutorials/blob/main/advanced_source/cpp_custom_ops.rst). The SYCL C++ extension API we've added to PyTorch extends this functionality, adding SYCL to the existing landscape. The obvious next step is to extend the existing C++/CUDA tutorial and example to also cover SYCL (see the registration sketch below). It seems you are against this proposal. Could you please explain your reasons for not adding SYCL to this tutorial and example, and suggest an alternative approach that would help extend the PyTorch ecosystem to cover devices and vendors other than CUDA/NVIDIA?
FYI, the proposal to change the tutorial is here:
- https://github.com/pytorch/tutorials/pull/3391
It's currently a draft, as it depends on this PR in the extension-cpp repository.
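To make the relationship to the existing C++/CUDA custom-op tutorial concrete, here is a hedged sketch of the registration side; the operator name and namespace are illustrative and not necessarily what this PR uses:

```cpp
#include <ATen/ATen.h>
#include <torch/library.h>

// Implemented in the SYCL source file (see the kernel sketch above).
at::Tensor mymuladd_xpu(const at::Tensor& a, const at::Tensor& b, double c);

// The schema is declared once, backend-agnostic (typically next to the CPU code):
// TORCH_LIBRARY(extension_cpp, m) {
//   m.def("mymuladd(Tensor a, Tensor b, float c) -> Tensor");
// }

// The SYCL build then only needs to register the XPU implementation of the same op;
// only the dispatch key changes compared to the CUDA version.
TORCH_LIBRARY_IMPL(extension_cpp, XPU, m) {
  m.impl("mymuladd", &mymuladd_xpu);
}
```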
> ...and suggest an alternative approach that would help extend the PyTorch ecosystem to cover devices and vendors other than CUDA/NVIDIA?
I would recommend having a separate extension-sycl repository showing users how to do a SYCL extension for XPU. The main reasons are:
- If we end up supporting "all backends" in this repository, it will get pretty bloated and difficult to work with. It'll be easier and faster for users to just clone an extension-sycl repo instead of cloning extension-cpp and deleting, e.g., the TPU and CUDA code.
- There's no code that is being reused between the SYCL and the CUDA extensions, so I don't see a good reason for putting all the code into one repo.
> I would recommend having a separate extension-sycl repository
@zou3519 My concern is about the visibility of a repository named extension-sycl. Note that extension-cpp is a generic name that basically invites contributing other device backends to it, which is what we did with this PR. If extension-cpp is restricted to CUDA, why isn't it named extension-cuda? Also, keep in mind that having two repos, one for C++/CUDA and another for C++/SYCL, will likely lead to code duplication around the C++ part, since we want that path demonstrated to users alongside SYCL. This can still be done as a design trade-off, but I wanted to mention it.
@albanD: we are trying to cover SYCL extension support in the PyTorch C++ extension API with tutorials/examples, and there are two different proposals on the table:
- Extend the existing tutorials/examples
- Create new ones
Could you please share your opinion on the matter? Who else should we invite to the discussion?
For the particular case of this example repo, my read is that it is intended to be a CUDA extension example. As such, it makes sense to me to have a separate example called extension-sycl that presents, in the simplest possible way, how to build extensions for SYCL.