Cross-Vendor Multi-GPU Support via Vulkan Backend
I realise that this is a long shot, but a Vulkan backend that allows multi-GPU inference would be awesome. With new models like Flux, inference requires more than 12 GB at fp8. Vulkan would solve mixed Nvidia/AMD setups and has already been implemented in llama.cpp (the implementation would have to differ here, since ComfyUI is built on Torch). I got llama inference working on my RTX 3060 12 GB + RX 6750 XT setup with Vulkan, so it would be great if something like this were possible.
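For a rough sense of the VRAM pressure, here is a back-of-the-envelope sketch; the parameter counts are my own approximate assumptions, not official figures, but they already put the fp8 weights alone past 12 GB once the text encoders are counted:

```python
# Back-of-the-envelope VRAM math for Flux at fp8 (1 byte per weight).
# Parameter counts are rough, assumed figures, not official numbers.
GIB = 1024 ** 3

params = {
    "flux transformer":    12.0e9,  # ~12B parameters
    "t5-xxl text encoder":  4.7e9,  # ~4.7B parameters
    "clip-l text encoder":  0.12e9,
    "vae":                  0.08e9,
}

weight_bytes = sum(params.values())  # fp8 -> roughly 1 byte per parameter
print(f"weights alone: ~{weight_bytes / GIB:.1f} GiB")  # ~15.7 GiB, before activations
```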
Similar requests have come up in the past, but it was unclear to me whether they were about multi-GPU inference for a single image in order to pool VRAM.
LLVM-IREE ( https://github.com/iree-org/iree ) has a Vulkan backend that works with PyTorch, ONNX, and JAX. They have a working Stable Diffusion example: https://github.com/nod-ai/SHARK
Note nod-ai belongs to AMD now.
My understanding is that they are also planning to support Ryzen AI; it's another backend for IREE that's in the works: https://github.com/nod-ai/iree-amd-aie
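To make the IREE route concrete, here is a minimal sketch of the compile-then-run flow that SHARK builds on, assuming the `iree-compiler` and `iree-runtime` pip packages and a working Vulkan driver; exact Python API names can vary between IREE releases, and this is only a toy elementwise multiply, not Stable Diffusion:

```python
# Compile a tiny MLIR function for the Vulkan/SPIR-V target and run it
# on IREE's Vulkan HAL driver. Requires iree-compiler, iree-runtime,
# numpy, and a Vulkan-capable GPU with an installed ICD.
import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

MLIR_SOURCE = """
func.func @simple_mul(%lhs: tensor<4xf32>, %rhs: tensor<4xf32>) -> tensor<4xf32> {
  %0 = arith.mulf %lhs, %rhs : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Compile to an IREE VM flatbuffer targeting Vulkan/SPIR-V.
vmfb = ireec.compile_str(MLIR_SOURCE, target_backends=["vulkan-spirv"])

# Load the module on the Vulkan driver and call the exported function.
module = ireert.load_vm_flatbuffer(vmfb, driver="vulkan")
a = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
b = np.array([5.0, 6.0, 7.0, 8.0], dtype=np.float32)
print(module.simple_mul(a, b))  # elementwise product computed via Vulkan
```

A real model would first be exported to MLIR (e.g. from PyTorch via iree-turbine or from an ONNX file) and then compiled through the same Vulkan/SPIR-V target.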
I recently tried SD 1.5 (not SDXL) on Vulkan using koboldcpp as a test. It was fast enough: a little slower than the CPU (since it's an iGPU, an Intel HD 620), but overall it saved time because I could still browse the internet without any hangs or crashes. I would really love this in ComfyUI; it might help support all kinds of GPUs that support Vulkan.
Yeah, Vulkan is awesome. LLVM-IREE seems promising, but a Vulkan backend integrated into ComfyUI would give better results because of the libraries already available with ComfyUI.
Using Vulkan would be great. ROCm works nicely, but its KFD breaks too easily, making the whole AMDGPU stack unusable until you reload the kernel module.
I'll second that Vulkan would be amazing, even if potentially less performant. Vulkan seems simpler to set up than juggling ROCm or CUDA versions, and it should work across many more vendors, potentially even on mobile via Termux or the like. If LLMs can run on it, SD should be able to as well :)
Vulkan would be a really good alternative for a backend that just works on AMD, Intel, and Nvidia GPUs. It would extend compatibility and give an option to people who don't want the huge CUDA installation, or who want a single installation they can use with GPUs from different vendors. The problem is probably that Torch doesn't officially support Vulkan, though people are working on it: https://docs.pytorch.org/tutorials/prototype/vulkan_workflow.html I would really hope for this to become a feature soon, even if experimental; I don't know how much work it would take to implement.
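For what it's worth, the prototype workflow in that tutorial requires a PyTorch build compiled with USE_VULKAN=1 (the stock wheels don't include it) and is aimed mostly at mobile, float32-only models. A minimal sketch of what it looks like, assuming such a build; the tiny Sequential model here is just a stand-in, the tutorial itself uses MobileNetV2:

```python
# Sketch of the prototype PyTorch Vulkan workflow from the linked tutorial.
# Assumes a PyTorch build compiled with USE_VULKAN=1; float32 models only.
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

assert torch.is_vulkan_available(), "this PyTorch build has no Vulkan support"

model = torch.nn.Sequential(          # stand-in model for illustration
    torch.nn.Conv2d(3, 8, kernel_size=3, padding=1),
    torch.nn.ReLU(),
).eval()

script_model = torch.jit.script(model)
vulkan_model = optimize_for_mobile(script_model, backend="vulkan")

x = torch.rand(1, 3, 224, 224)
out = vulkan_model(x.to("vulkan"))    # inputs are moved to the Vulkan device
print(out.cpu().shape)                # copy back to CPU to inspect
```

Since it targets mobile and only float32, this is more a proof of concept than a desktop SD backend, which is probably why the IREE route looks more realistic for ComfyUI today.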
Vulkan support would also allow FreeBSD users to run GPU-accelerated workloads without having to use the Linuxulator, since NVIDIA doesn't provide CUDA on FreeBSD. Assuming, of course, that we can even get ComfyUI to run on FreeBSD (most of the dependencies work).
+1 for Vulkan; it would enable everyone with an AMD iGPU to use ComfyUI. At the moment Vulkan is about 2x faster than ROCm (for both token generation and prompt processing) in LLM inference on AMD iGPUs under Linux.
+1
Another +1, since ROCm is so shitty. For LLMs, Vulkan is already faster than ROCm!
Everyone agrees that implementing Vulkan is a good thing; for two years now, everyone has been nodding their heads in agreement. :')