stable-diffusion-webui
[Feature Request]: Support for Intel Oneapi/Vulkan versions of pytorch as well
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
This is a brilliant project and I like that it supports most versions of pytorch.
A large group of users on unsupported machines - Intel, Windows, etc. - gets excluded from the performance options (which are basically CUDA and wannabe-CUDA). Many of these machines have fairly decent hardware; it just doesn't run CUDA/ROCm. PyTorch variants like oneAPI or Vulkan would really take the reach of this project out to those with lesser machines, so to say. https://pytorch.org/tutorials/recipes/recipes/intel_extension_for_pytorch.html
I'm not a coder, but Intel has a PyTorch extension in the works that fills a role similar to CUDA/ROCm, and it seems to support a lot of Intel CPUs and GPUs, including discrete GPUs and older ones abandoned by ROCm: https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master
Adapting the code doesn't seem to be excessively complicated: https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/examples.html https://intel.github.io/intel-extension-for-pytorch/xpu/1.10.200+gpu/tutorials/api_doc.html
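For illustration, the adaptation shown in the linked examples boils down to a few lines. Here is a minimal sketch (the model and tensor shapes are made up for illustration; it assumes the intel_extension_for_pytorch package is installed):

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the "xpu" device type

# Illustrative stand-in model; the real change is just .to("xpu") and ipex.optimize()
model = torch.nn.Linear(128, 64).eval()
model = model.to("xpu")       # move the model to the Intel GPU
model = ipex.optimize(model)  # apply IPEX kernel optimizations

x = torch.randn(1, 128, device="xpu")
with torch.no_grad():
    y = model(x)
print(y.shape)  # torch.Size([1, 64])
```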
It would make the project accessible to those with simpler laptops/desktops.
https://towardsdatascience.com/pytorch-stable-diffusion-using-hugging-face-and-intel-arc-77010e9eead6
Proposed workflow
- Go to ....
- Press ....
- ...
Additional information
No response
Yes, it would be nice to squeeze those 16 GB from the Intel Arc A770. It seems that the problem resides in PyTorch itself (https://github.com/pytorch/pytorch/issues/30029): PyTorch will need to support oneAPI. But it seems it's possible to run PyTorch on Intel GPUs through extensions, as stated in https://github.com/intel/intel-extension-for-pytorch/tree/xpu-master
This Reddit thread has useful information about the possible ways Intel could approach the Stable Diffusion/PyTorch problem: https://www.reddit.com/r/intel/comments/xvbmif/will_intel_arc_support_stable_diffusion/
Hi @uxdesignerhector,
PyTorch HAS the Intel extension, though unlike ROCm it requires code changes as it stands. It is just a couple of lines (roughly the sketch below) - this project already does something similar to integrate mps, which is why I suggested it here. But the extension can accelerate CPUs, and the unreleased version runs on older GPUs and whatnot, which is great! I wouldn't be surprised if such an integration made this version of Stable Diffusion the staple implementation.
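Something along these lines, presumably - a hypothetical sketch (the function name and fallback order are illustrative, not the webui's actual code) of folding an xpu check into the usual cuda/mps selection:

```python
import torch

try:
    import intel_extension_for_pytorch  # noqa: F401 -- adds the torch.xpu namespace
except ImportError:
    pass

def pick_device() -> torch.device:
    """Prefer CUDA, then Intel XPU, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

print(pick_device())
```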
Stable Diffusion runs on TensorFlow, I think, which supports oneAPI - so this is less an Intel issue and more one for those who love this project, with its well-designed implementation, but would rather not wait ages while their hardware twiddles its thumbs. Almost nothing (that wouldn't crash at the task) would be left out, since I think it would also automatically support OpenCL.
Not to mention I am fed up with these elitist projects refusing to recognise anything non-CUDA as a GPU!!! (This includes Intel's openvino-gpu runtime - which is basically for CUDA/ROCm!!!) This repository, with its inclusion of everything it can lay its hands on, is literally the only reason I bother with PyTorch. (That said, I'm not a coder, so it's not like I'm using all sorts of other technologies anyway.)
V
The Intel extension for GPU (v1.13.10+xpu) now supports PyTorch 1.13: https://github.com/intel/intel-extension-for-pytorch/releases/tag/v1.13.10%2Bxpu
For anyone looking for working code for Stable Diffusion on Intel dGPUs (Arc Alchemist) and iGPUs with PyTorch and TensorFlow, please check this out: https://github.com/rahulunair/stable_diffusion_arc or my blog: https://blog.rahul.onl/posts/2022-09-06-arc-dgpu-stable-diffusion.html
For context, oneAPI is already part of PyTorch and TensorFlow as oneDNN, a oneAPI library that is the default CPU accelerator for both frameworks. And the Intel extension for PyTorch (ipex) provides kernels for further optimizations as well as the Intel GPU backend. Eventually most of the code from ipex should be merged into mainline PyTorch.
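To see oneDNN already at work in a stock PyTorch build, a quick check like this (plain PyTorch API, nothing IPEX-specific) prints the mkldnn/oneDNN status and the version compiled in:

```python
import torch

print(torch.backends.mkldnn.is_available())  # True on standard CPU builds
# The build configuration string lists the oneDNN (a.k.a. MKL-DNN) version:
print("\n".join(line for line in torch.__config__.show().splitlines()
                if "DNN" in line.upper()))
```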
Thank you for your clarification.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/4690
I'm going to take a stab at putting together a PR for this...
Unfortunately it's more than a few lines of code. And getting the Intel libraries and drivers set up isn't well integrated with distributions.
This is a work in progress, but it shows signs of life: https://github.com/jbaboval/stable-diffusion-webui/tree/oneapi
I'm still having some issues. One is seeding: I can't get reproducible output. I thought it might be the seeding in pytorch_lightning, but at this point I have implemented full support in pytorch_lightning and instrumented the seeding code there - it never gets called. All the seeding happens in sd-webui. I've also instrumented sd-webui to validate the repeatability of the noise and subnoise, and it's fully repeatable. Not sure what gives yet.
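For context, a standalone probe along these lines (a sketch; the latent shape is illustrative) shows what "repeatable noise" means here - the CPU-side seeding itself is deterministic:

```python
import torch

def sample_noise(seed: int) -> torch.Tensor:
    # the webui draws its initial latent noise from an explicitly seeded generator
    gen = torch.Generator(device="cpu").manual_seed(seed)
    return torch.randn((1, 4, 64, 64), generator=gen)

print(torch.equal(sample_noise(1234), sample_noise(1234)))  # True
# So any irreproducibility must creep in later, in ops executed on the xpu itself.
```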
The other issue is that batches always have junk for the second image.

On the plus side, it's really fast. Especially compared to my old GTX 1660.
https://github.com/intel/intel-extension-for-pytorch/issues/252
I'm not a coder. I can't even begin to figure this out, but I'd be happy to test if you've uploaded what you have to GitHub.
It's linked above. I made some notes in ArcNotes.txt that might help get you set up.
If you're going to try the branch above:
- It might not work without my pytorch_lightning branch. I think it will, but if not let me know. I can test later and fix it.
- Turn up the batch size
- Pass --use-intel-oneapi to launch.py
- Pass --config configs/v1-inference-xpu.yaml to launch.py
Saw your comment just now and tried it.
I had everything installed and the preparation was fine as per your test.
--use-intel-oneapi wasn't recognised. So I probably did something wrong.
The command to make the Intel version of Python the system default is problematic, and I almost broke other Python stuff in the process. Better to use the setup vars in a launcher used only for this, or add them to .bashrc (and comment them out when not needed...).
Something like a small script - that way you can comment or uncomment options quickly to test (and, if you're like me, not forget commands):
#!/bin/bash
. /opt/intel/oneapi/setvars.sh
TORCH_COMMAND='pip install torch torchvision' python launch.py --medvram --precision full --no-half --skip-torch-cuda-test
That said:
- Stable Diffusion started without trouble and loaded the webpage. The code didn't break.
- Takes a long time to draw a single image
- But I'm not sure it is using the XPU
- Will need more investigation and tweaking. Work in progress. Will update.
For reference (the conspicuous lack of "xpu" and similar words in the output suggests I missed a trick somewhere):
```
:: oneAPI environment initialized ::
Python 3.9.15 (main, Nov 11 2022, 13:58:57) [GCC 11.2.0]
Commit hash: 3a0d6b77295162146d0a8d04278804334da6f1b4
Installing requirements for Web UI
Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/[stuff]/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx', memory monitor disabled
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading weights [fe4efff1e1] from /home/[stuff]/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 119.4s (1.8s create model, 109.9s load weights).
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
100%|███████████████████████████████████████████| 20/20 [09:44<00:00, 29.21s/it]
Total progress: 100%|███████████████████████████| 20/20 [09:18<00:00, 32.14s/it]
```
Will be able to spend time properly on this in a day or so and update any enlightenment that follows.
Arrrgh. Never mind. The torch version was wrong (I accidentally installed it in the regular Python, so the script installed the regular torch in Intel's Python...). Now sorted. And now I have problems with Intel's torch and torchvision playing nice with each other... trying Intel's torch with the regular torchvision. Sigh.
Update: It fails with Intel's torchvision, but works with Intel's torch and the regular torchvision. But it still takes too long, probably because I can't convince it to use the parameters you said to pass. The xpu test returns True, so the requirements are installed. But I still don't think it is using the xpu.
This is currently slower than the untouched CPU version.
100%|███████████████████████████████████████████| 20/20 [10:11<00:00, 30.56s/it]
Total progress: 100%|███████████████████████████| 20/20 [10:27<00:00, 31.38s/it]
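(For anyone else debugging this: one way to tell whether the xpu is actually being exercised, independent of the webui, is a small timing probe like this sketch - it assumes IPEX's torch.xpu API, and the sizes are arbitrary:)

```python
import time
import torch
import intel_extension_for_pytorch  # noqa: F401 -- enables the "xpu" device

def bench(device: str, n: int = 2048, iters: int = 10) -> float:
    x = torch.randn(n, n, device=device)
    _ = x @ x  # warm-up so one-time setup costs aren't timed
    if device == "xpu":
        torch.xpu.synchronize()
    start = time.time()
    for _ in range(iters):
        _ = x @ x
    if device == "xpu":
        torch.xpu.synchronize()  # wait for queued GPU work before stopping the clock
    return time.time() - start

print("cpu:", bench("cpu"))
print("xpu:", bench("xpu"))  # should be clearly faster if the GPU is really in use
```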
Can you check what branch of the fork you're on? It should be oneapi.
If it's not recognizing the command line option, it's definitely not running the right code.
Okay, you were right. It was the wrong branch. facepalm. I downloaded the zip, and I think it is still master. Not sure how to get the oneapi branch (I'm a champion copy-paster, but don't actually know a lot). Figuring it out.
I'm not sure how you get the branch with the zip download. I just grabbed the zip and it doesn't include the .git directory.
Try git clone -b oneapi https://github.com/jbaboval/stable-diffusion-webui.git
> Unfortunately it's more than a few lines of code. And getting the Intel libraries and drivers set up isn't well integrated with distributions.
> This is a work in progress, but it shows signs of life: https://github.com/jbaboval/stable-diffusion-webui/tree/oneapi

Will it work using an Intel iGPU?
> Try git clone -b oneapi https://github.com/jbaboval/stable-diffusion-webui.git

I'm fairly certain I have the right branch now. It has the ArcNotes.txt - so what am I doing wrong?
Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
/home/vidyut/.local/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
  warn(f"Failed to load image Python extension: {e}")
'NoneType' object has no attribute 'enable_tf32': str
Traceback (most recent call last):
  File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/errors.py", line 29, in run
    code()
  File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 58, in enable_tf32
    return impl.enable_tf32()
AttributeError: 'NoneType' object has no attribute 'enable_tf32'
2023-01-24 16:59:58,437 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpsg3pgt5w
2023-01-24 16:59:58,438 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpsg3pgt5w/_remote_module_non_scriptable.py
2023-01-24 16:59:58,634 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 16:59:58,648 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 315, in
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 132, in
Reinstalled everything. Different error.
Launching Web UI with arguments: --medvram --precision full --no-half --ckpt /home/vidyut/AI/TEST/stable-diffusion-webui/models/Stable-diffusion/sd-v1-4.ckpt --config configs/v1-inference-xpu.yaml
'NoneType' object has no attribute 'enable_tf32': str
Traceback (most recent call last):
  File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/errors.py", line 29, in run
    code()
  File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/accelerator.py", line 58, in enable_tf32
    return impl.enable_tf32()
AttributeError: 'NoneType' object has no attribute 'enable_tf32'
2023-01-24 17:38:30,868 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpnapm8vcl
2023-01-24 17:38:30,869 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpnapm8vcl/_remote_module_non_scriptable.py
2023-01-24 17:38:31,054 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 17:38:31,075 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Traceback (most recent call last):
File "/home/vidyut/AI/TEST/stable-diffusion-webui/launch.py", line 315, in
File "/home/vidyut/AI/TEST/stable-diffusion-webui/modules/shared.py", line 132, in
At this point I'm not sure this is within my ability.
It should be telling you right at the beginning that it's using OneAPI:
Launching Web UI with arguments: --config configs/v1-inference-xpu.yaml --listen
OneAPI is available
2023-01-24 08:03:28,418 - torch.distributed.nn.jit.instantiator - INFO - Created a temporary directory at /tmp/tmpsh22b93t
2023-01-24 08:03:28,419 - torch.distributed.nn.jit.instantiator - INFO - Writing /tmp/tmpsh22b93t/_remote_module_non_scriptable.py
2023-01-24 08:03:28,468 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
2023-01-24 08:03:28,501 - root - WARNING - Pytorch pre-release version 1.13.0a0+gitb1dde16 - assuming intent to test it
No module 'xformers'. Proceeding without it.
Device is xpu
However it shouldn't crash out with an exception if it's not working. I'll have to fix that.
In the meantime you'll have to figure out how to get your OneAPI environment working before I can help with the webui. There's a section in the notes about how to validate it:
> python3
Python 3.9.15 (main, Nov 11 2022, 13:58:57)
[GCC 11.2.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>> import torch
>>> import intel_extension_for_pytorch
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
operator: torchvision::nms
no debug info
dispatch key: CPU
previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
>>> torch.xpu.is_available()
True
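If that returns True, it can also be worth confirming which device IPEX actually found - torch.xpu mirrors the torch.cuda API for these calls:

```python
import torch
import intel_extension_for_pytorch  # noqa: F401

print(torch.xpu.device_count())      # number of XPU devices detected
print(torch.xpu.get_device_name(0))  # model string of the GPU being used
```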
There's a new branch: rebase. It has a fix for the above exception (your GPU still won't work if you don't get the "OneAPI is available" message), and it includes the latest upstream changes.
Contents of test.sh:
#!/bin/bash
. /opt/intel/oneapi/setvars.sh
sycl-ls
pip list | grep torch
python -c 'import torch; import intel_extension_for_pytorch; print(torch.xpu.is_available())'
Result:
vidyut@saaki:~/AI/TEST/stable-diffusion-webui$ sh test.sh
:: initializing oneAPI environment ...
   test.sh: SH_VERSION = unknown
   args: Using "$@" for setvars.sh arguments:
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: embree -- latest
:: inspector -- latest
:: intelpython -- latest
:: ipp -- latest
:: ippcp -- latest
:: ispc -- latest
:: mkl -- latest
:: modelzoo -- latest
:: modin -- latest
:: mpi -- latest
:: neural-compressor -- latest
:: oidn -- latest
:: openvkl -- latest
:: ospray -- latest
:: ospray_studio -- latest
:: pytorch -- latest
:: rkcommon -- latest
:: rkutil -- latest
:: tbb -- latest
:: tensorflow -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2022.15.12.0.01_081451]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz 3.0 [2022.15.12.0.01_081451]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) HD Graphics 520 [0x1916] 3.0 [22.43.24595.35]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) HD Graphics 520 [0x1916] 1.3 [1.3.24595]
intel-extension-for-pytorch 1.13.10+xpu
open-clip-torch 2.7.0
pytorch-lightning 1.7.6
torch 1.13.0a0+gitb1dde16
torchdiffeq 0.2.3
torchmetrics 0.11.0
torchsde 0.2.5
torchvision 0.14.1a0+0504df5
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: torchvision::nms
  no debug info
  dispatch key: CPU
  previous kernel: registered at /build/intel-pytorch-extension/csrc/cpu/aten/TorchVisionNms.cpp:47
  new kernel: registered at /opt/workspace/vision/torchvision/csrc/ops/cpu/nms_kernel.cpp:112 (function registerKernel)
True
Are you passing the originally suggested arguments to launch.py? Because I am, but they aren't showing in your example. Maybe that's the issue? Update: It doesn't work without them, passing just the two you mentioned, either. The "OneAPI is available" message doesn't show.
I'm able to set up the environment as far as I can tell, but I can't get the code to run. Maybe there's a missing dependency...
I've got other work I'm doing now. Will test more when I get time.
Can you try running source ./venv/bin/activate in your webui tree to activate the virtual environment and then run your test script again?
I think that the problem now might be a difference between your system python environment and the venv environment.
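One quick way to check which interpreter is actually running (generic Python, nothing webui-specific) is to compare the prefixes from inside each environment:

```python
import sys

print(sys.executable)                 # should point inside ./venv/bin when the venv is active
print(sys.prefix != sys.base_prefix)  # True only when running inside a virtual environment
```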
nope :(
Sorry I couldn't get it working for you. I'm going to try to tidy this stuff up and submit it back, so hopefully you'll have better luck when it's properly integrated.
Could you provide an installation tutorial for Windows? I would like to try it on my laptop, because I'm sick of waiting for my CPU to generate images. My laptop specs: i5-1135G7, 16 GB DDR4 RAM, Intel Xe graphics (80 EU), and Intel Xe Max graphics (DG1).
It cannot work on plain Windows, as the PyTorch extensions are Linux-only. It can work with WSL, though.
This is intended for Intel Arc GPUs, which are desktop cards significantly different from the integrated graphics you have on your laptop.