tomesd
directml
Hi, I tried running this with https://github.com/lshqqytiger/stable-diffusion-webui-directml together with this extension: https://git.mmaker.moe/mmaker/sd-webui-tome. I get errors when I turn on ToMe. Is this related to using torch 1.13.1, or is it a problem with DirectML? AMD video cards are very slow; my Vega 56 is 7 times slower than an RTX 3060. Your work would be very useful for AMD owners: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/61
venv "D:\neiro\last\stable-diffusion-webui-directml\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: ae337fa39b6d4598b377ff312c53b14c15142331
Installing requirements for Web UI
Launching Web UI with arguments: --medvram --disable-nan-check --autolaunch --opt-split-attention-invokeai --opt-sub-quad-attention --theme dark --no-half --precision full --no-half-vae --ckpt-dir D:\neiro\AMD\stable-diffusion-webui-directml
Warning: experimental graphic memory optimization is disabled due to gpu vendor. Currently this optimization is only available for AMDGPUs.
Disabled experimental graphic memory optimizations.
Interrogations are fallen back to cpu. This doesn't affect on image generation. But if you want to use interrogate (CLIP or DeepBooru), check out this issue: https://github.com/lshqqytiger/stable-diffusion-webui-directml/issues/10
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [2085909b28] from D:\neiro\AMD\stable-diffusion-webui-directml\models\Stable-diffusion\donkoMix_donkoMix.safetensors
Creating model from config: D:\neiro\last\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: D:\neiro\AMD\stable-diffusion-webui-directml\models\VAE\novelai.vae.pt
Applying sub-quadratic cross attention optimization.
Textual inversion embeddings loaded(0):
Applying ToMe patch...
ToMe patch applied
Model loaded in 1.9s (load weights from disk: 0.2s, create model: 0.5s, apply weights to model: 0.6s, load VAE: 0.5s).
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
Startup time: 61.7s (import torch: 1.7s, import gradio: 1.2s, import ldm: 0.5s, other imports: 2.3s, list SD models: 42.8s, load scripts: 1.3s, refresh VAE: 2.0s, load SD checkpoint: 2.0s, create ui: 7.5s, gradio launch: 0.3s).
0%| | 0/26 [00:02<?, ?it/s]
Error completing request
Arguments: ('task(grxvqmflcpyiis9)', '1girl', '(worst quality, low quality:1.4), (monochrome), zombie,badv3, badhandv4', [], 26, 15, False, False, 1, 1, 6, 11691188.0, -1.0, 0, 0, 0, False, 896, 896, False, 0.7, 2, 'Latent', 0, 0, 0, [], 0, False, 'MultiDiffusion', False, 10, 1, 1, 64, False, True, 1024, 1024, 96, 96, 48, 1, 'None', 2, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, False, True, True, False, 512, 64, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0) {}
Traceback (most recent call last):
File "D:\neiro\last\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "D:\neiro\last\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "D:\neiro\last\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
processed = process_images(p)
File "D:\neiro\last\stable-diffusion-webui-directml\modules\processing.py", line 503, in process_images
res = process_images_inner(p)
File "D:\neiro\last\stable-diffusion-webui-directml\modules\processing.py", line 653, in process_images_inner
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
File "D:\neiro\last\stable-diffusion-webui-directml\modules\processing.py", line 869, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "D:\neiro\last\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 358, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args={
File "D:\neiro\last\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 234, in launch_sampling
return func()
File "D:\neiro\last\stable-diffusion-webui-directml\modules\sd_samplers_kdiffusion.py", line 358, in
Seems to be an issue with the requirements of gather, similar to M1 Macs (#4).
Are you able to get any more information than just "RuntimeError"? It would be useful to know if this is the same issue.
What information do you need? I will try to provide everything you need, within the limits of my skills.
Ah, nvm I found it: https://learn.microsoft.com/en-us/windows/win32/api/directml/ns-directml-dml_gather_operator_desc
It is indeed the same issue:
IndicesTensor Type: const DML_TENSOR_DESC A tensor containing the indices. The DimensionCount of this tensor must match InputTensor.DimensionCount.
Interesting that multiple libraries have such restrictive stipulations on their gather operations. If I can find a way to reproduce this error, I might try to create a version of the function without these gathers (that might be slower, but better than nothing).
Edit: on second thought, that might just mean the number of dimensions has to be the same. Still seems to be an issue with gather though. I'll see if I can reproduce it.
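To make the gather discussion concrete, here is a minimal sketch in plain Python, with nested lists standing in for tensors. It is not tomesd's actual code, and the function names (`gather_tokens`, `expand_index`, `select_tokens_onehot`) are hypothetical. It shows what a gather along the token dimension computes when the index has the same number of dimensions as the input, and one possible gather-free rewrite, assuming a one-hot selection matrix multiply is an acceptable substitute:

```python
def gather_tokens(x, idx):
    """Reference gather along the token axis.

    x:   [batch][tokens][channels]
    idx: [batch][picked][channels], same number of dimensions as x,
         with out[b][i][c] = x[b][idx[b][i][c]][c].
    """
    return [
        [[x[b][idx[b][i][c]][c] for c in range(len(idx[b][i]))]
         for i in range(len(idx[b]))]
        for b in range(len(idx))
    ]


def expand_index(token_idx, channels):
    """Repeat per-token indices [batch][picked] across the channel axis."""
    return [[[t] * channels for t in row] for row in token_idx]


def select_tokens_onehot(x, token_idx):
    """Gather-free alternative: multiply by a one-hot selection matrix."""
    out = []
    for b, row in enumerate(token_idx):
        n_tokens = len(x[b])
        channels = len(x[b][0])
        # sel[i][t] is 1 when output row i should copy input token t.
        sel = [[1 if t == picked else 0 for t in range(n_tokens)]
               for picked in row]
        out.append([
            [sum(sel[i][t] * x[b][t][c] for t in range(n_tokens))
             for c in range(channels)]
            for i in range(len(row))
        ])
    return out


x = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]   # 1 batch, 3 tokens, 2 channels
token_idx = [[2, 0]]                          # pick token 2, then token 0

via_gather = gather_tokens(x, expand_index(token_idx, channels=2))
via_onehot = select_tokens_onehot(x, token_idx)
assert via_gather == via_onehot
```

The one-hot version does work proportional to tokens × picked × channels instead of picked × channels, which is the "slower, but better than nothing" trade-off. Whether it actually avoids the DirectML error is untested here.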
ok, we will wait and believe in you :thumbsup:
You'd be better off using ROCm on Linux. It is much faster than DirectML, and the memory management is better.
I used Linux but didn't see any noticeable speed increase; for me it's only 20% faster. I stopped using Linux when it became possible to use DirectML, as it is much more convenient.
I've been having the same issue, which is a shame, because as stated, AMD could really use this boost.
Traceback (most recent call last):
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\call_queue.py", line 56, in f
res = list(func(*args, **kwargs))
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\call_queue.py", line 37, in f
res = func(*args, **kwargs)
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\txt2img.py", line 56, in txt2img
processed = process_images(p)
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\processing.py", line 504, in process_images
res = process_images_inner(p)
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\processing.py", line 654, in process_images_inner
samples_ddim = p.sample(conditioning=c, unconditional_conditioning=uc, seeds=seeds, subseeds=subseeds, subseed_strength=p.subseed_strength, prompts=prompts)
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\processing.py", line 870, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\sd_samplers_compvis.py", line 218, in sample
samples_ddim = self.launch_sampling(steps, lambda: self.sampler.sample(S=steps, conditioning=conditioning, batch_size=int(x.shape[0]), shape=x[0].shape, verbose=False, unconditional_guidance_scale=p.cfg_scale, unconditional_conditioning=unconditional_conditioning, x_T=x, eta=self.eta)[0])
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\sd_samplers_compvis.py", line 51, in launch_sampling
return func()
File "D:\AI.stablediffusion\stable-diffusion-webui-directml\modules\sd_samplers_compvis.py", line 218, in
Oh, and for some odd reason, it even happens if I have ToMe unchecked in the UI. As long as the merging ratio is above 0, I get that error.