stable-diffusion-webui
Non-deterministic across different batch sizes
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What happened?
To start, I have experienced this on both Apple silicon and an AMD GPU (7900XTX) on Ubuntu.
I searched for this issue already and came across this report, but it was closed as fixed back in February: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5210
I have generated two images using the DPM++ SDE Karras sampler and I've gotten different results for the same prompt and settings depending on the size of the batches that I run. If I run with batch size 1 and batch size 2 with the same seed, the image appears mostly similar but has subtle differences and a different sha256 hash. Here is one example (I recommend opening both in new tabs and cycling back and forth):
dog, autumn in paris, ornate, beautiful, atmosphere, vibe, mist, smoke, fire, chimney, rain, wet, pristine, puddles, melting, dripping, snow, creek, lush, ice, bridge, forest, roses, flowers, by stanley artgerm lau, greg rutkowski, thomas kindkade, alphonse mucha, loish, norman rockwell
Steps: 20, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 1543893233, Size: 512x512, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly, Clip skip: 2, Version: v1.3.0
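For reference, the hash comparison mentioned above can be reproduced with a small script. This is just a minimal sketch; the filenames are placeholders for the two saved outputs.

```python
import hashlib

def sha256_of(path: str) -> str:
    """Hex sha256 digest of a file on disk."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Placeholder filenames; substitute your own output paths.
h1 = sha256_of("seed1543893233_batch_size_1.png")
h2 = sha256_of("seed1543893233_batch_size_2_first.png")
print(h1)
print(h2)
print("identical:", h1 == h2)
```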
Steps to reproduce the problem
- Generate image with batch size = 1 (batch count does not matter)
- Recycle seed
- Generate another image with batch size > 1 (batch count does not matter)
- Images generated are similar, but not identical (a scripted reproduction via the API is sketched below)
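If you prefer a scripted reproduction, here is a minimal sketch using the txt2img API. It assumes the webui was launched with `--api` on the default `127.0.0.1:7860`; field names follow the `/sdapi/v1/txt2img` endpoint, the prompt is truncated to a placeholder, and newer builds may name the sampler differently.

```python
import base64
from io import BytesIO

import numpy as np
import requests
from PIL import Image

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"  # default local address, assumes --api

def generate(batch_size: int) -> list[bytes]:
    """Request a txt2img generation and return the decoded PNG bytes for each image."""
    payload = {
        "prompt": "dog, autumn in paris, ornate, beautiful",  # truncated placeholder prompt
        "seed": 1543893233,
        "steps": 20,
        "cfg_scale": 7,
        "width": 512,
        "height": 512,
        "batch_size": batch_size,
        # Newer builds may expect "DPM++ SDE" plus a separate "scheduler": "karras" field.
        "sampler_name": "DPM++ SDE Karras",
    }
    r = requests.post(URL, json=payload, timeout=600)
    r.raise_for_status()
    # Images come back base64-encoded; strip a data-URI prefix if one is present.
    return [base64.b64decode(img.split(",", 1)[-1]) for img in r.json()["images"]]

def pixels(png_bytes: bytes) -> np.ndarray:
    """Decode PNG bytes to an RGB array so metadata differences are ignored."""
    return np.asarray(Image.open(BytesIO(png_bytes)).convert("RGB"))

single = generate(batch_size=1)[0]
batched_first = generate(batch_size=2)[0]  # first image of the batch reuses the same seed
print("pixel-identical:", np.array_equal(pixels(single), pixels(batched_first)))
```

Comparing decoded pixels rather than raw file bytes avoids false positives from differing embedded metadata.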
What should have happened?
Both images should be identical regardless of batch size
Commit where the problem happens
20ae71faa8ef035c31aa3a410b707d792c8203a3
What Python version are you running on ?
Python 3.10.x
What platforms do you use to access the UI ?
MacOS
What device are you running WebUI on?
Other GPUs
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
--skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate
List of extensions
None
Console logs
################################################################
Launching launch.py...
################################################################
Python 3.10.11 (main, Apr 7 2023, 07:24:47) [Clang 14.0.0 (clang-1400.0.29.202)]
Version: v1.3.0
Commit hash: 20ae71faa8ef035c31aa3a410b707d792c8203a3
Installing requirements
Launching Web UI with arguments: --skip-torch-cuda-test --upcast-sampling --no-half-vae --use-cpu interrogate
No module 'xformers'. Proceeding without it.
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
Loading weights [6ce0161689] from /Users/myuser/dev/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /Users/myuser/dev/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 5.3s (import torch: 1.3s, import gradio: 1.2s, import ldm: 0.3s, other imports: 1.6s, load scripts: 0.4s, create ui: 0.4s, gradio launch: 0.1s).
DiffusionWrapper has 859.52 M params.
Applying optimization: InvokeAI... done.
Textual inversion embeddings loaded(0):
Model loaded in 8.3s (load weights from disk: 0.5s, create model: 0.6s, apply weights to model: 5.2s, apply half(): 1.1s, move model to device: 0.9s).
100%|███████████████████████████████████████████| 20/20 [00:18<00:00, 1.08it/s]
Total progress: 100%|███████████████████████████| 20/20 [00:17<00:00, 1.15it/s]
100%|███████████████████████████████████████████| 20/20 [00:17<00:00, 1.12it/s]
Total progress: 100%|███████████████████████████| 20/20 [00:16<00:00, 1.19it/s]
25%|███████████ | 5/20 [00:04<00:13, 1.13it/s]
Total progress: 25%|███████ | 5/20 [00:04<00:13, 1.14it/s]
100%|███████████████████████████████████████████| 20/20 [00:16<00:00, 1.20it/s]
Total progress: 100%|███████████████████████████| 20/20 [00:16<00:00, 1.21it/s]
100%|███████████████████████████████████████████| 20/20 [00:16<00:00, 1.19it/s]
100%|███████████████████████████████████████████| 20/20 [00:16<00:00, 1.20it/s]
Total progress: 100%|███████████████████████████| 40/40 [00:33<00:00, 1.18it/s]
100%|███████████████████████████████████████████| 20/20 [00:38<00:00, 1.91s/it]
Total progress: 100%|███████████████████████████| 20/20 [00:37<00:00, 1.89s/it]
Total progress: 100%|███████████████████████████| 20/20 [00:37<00:00, 1.87s/it]
Additional information
I do not have the compatibility setting checked that would make things non-deterministic
The same goes for other samplers.
Doesn't it increment the seed by default within a batch?
i.e., check whether the second image generated matches the output from the same generation settings with Seed + 1.
@AlphaJuliettOmega Yes, but if you recycle the seed then it'll reuse the same seed for the first image in the batch. You can download those two images and see that they have the same seed with PNG Info (assuming GitHub doesn't strip metadata). Even a change of 1 in the seed will change the image drastically (which is why batching is useful when you want to generate a lot of images with one prompt).
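For anyone checking offline, here is a minimal sketch that reads the embedded generation settings out of each PNG; the webui stores them in the image's "parameters" text chunk, which PIL exposes. The filenames are placeholders.

```python
from PIL import Image

# Placeholder filenames for the two downloaded images.
for path in ("from_batch_size_1.png", "from_batch_size_2.png"):
    img = Image.open(path)
    # PNG text chunks show up in .info (and in .text on PngImageFile).
    params = img.info.get("parameters") or getattr(img, "text", {}).get("parameters")
    print(path)
    print(params)  # both should report "Seed: 1543893233"
    print()
```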
If my observation is correct, it is not actually non-deterministic; rather, the result differs depending on whether you run the images individually or with a larger batch size.
The easiest way to see this is to run images with batch count 2, then with batch size 2, and compare the results: they will be different. But if you compare multiple runs of the same configuration (batch size against batch size, or batch count against batch count), they will be identical.
The results with batch count are the same as the results you would get by running the images individually.
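To make that comparison concrete, a quick pixel-level diff of the first image from a batch count 2 run against the first image from a batch size 2 run (placeholder filenames; same seed and settings assumed):

```python
import numpy as np
from PIL import Image

# Placeholder filenames: first image from a batch count 2 run vs. a batch size 2 run.
a = np.asarray(Image.open("first_of_batch_count_2.png").convert("RGB"), dtype=np.int16)
b = np.asarray(Image.open("first_of_batch_size_2.png").convert("RGB"), dtype=np.int16)

diff = np.abs(a - b)
print("identical:", bool((diff == 0).all()))
print("max channel delta:", int(diff.max()), "| mean delta:", round(float(diff.mean()), 3))
```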
@w-e-w Yeah, that's correct. It might be better to say "non-deterministic across different batch sizes". Still, I would expect the same inputs to give the same outputs regardless of batch size.
The same goes for other samplers; not sure if there is room for improvement.
I can confirm this issue is present (on a Windows machine), but using the DPM++ 2M Karras sampler. Generating an image as part of a batch, then generating only that image from its seed with a batch size of 1, produces noticeable differences in the final image.
I think for SDE this was fixed in February's commit with the option "Do not make DPM++ SDE deterministic across different batch sizes."
But it also affects DPM++ 2M Karras.
I could reproduce the issue with the Heun sampler as well; this was not an issue in versions before 1.6. Command line arguments:
COMMANDLINE_ARGS= --no-half --opt-sdp-attention --medvram
This should not be labeled as a Mac platform issue; it is being reproduced on other platforms as well.
👍 Also experiencing this on Windows using DPM++ 2M SDE Heun Karras
Also experiencing this on Windows with DPM++ 2M SDE Karras, on a 2060 card.
After experiencing the same thing, I found this github issue and am also currently experiencing this phenomenon on an Apple M2 Max Macbook Pro running Sonoma 14.1.2 with automatic1111 1.7.0 (checkpoint 53b98bb91a) and with samplers Euler, DPM++ SDE, and DPM++ 3M SDE Karras.
I too have the "Do not make deterministic" setting unchecked, I have checked the seed breaking changes guide, and my installation of automatic1111 1.7.0 is stock as far as the macOS instructions go.
I'm having the same issue using DDPM Karras (using Forge on Win 10). I tried changing the various settings noted in this thread, and I also tried the extension, but none of it helped, unfortunately.