
[Bug]: Cannot reproduce results between platforms.

Open Myridium opened this issue 2 years ago • 1 comment

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

Generated images are inconsistent between two of my machines, even with the same model and the same generation parameters. (Attached: linux-75944430.png and macos-75944430.png.)

Steps to reproduce the problem

The same version of stable-diffusion-webui is installed on a macOS machine and a Linux machine. The commit hash is e8a41df49fadd2cf9f23b1f02d75a4947bec5646.

I deleted the venv directory on each machine and let it reinstall from scratch, for a clean install.

Configuration parameters between the machines are identical, including:

  • The seed.
  • The CFG.
  • The number of iterations.
  • The sampling method.
  • The image pixel dimensions.
  • The diffusion model (I confirmed that the model hashes match).
  • The VAE (set to None).

I ran exiftool on the two images attached, and here is the output from each.

From the image generated on macOS:

$ exiftool macos-75944430.png
ExifTool Version Number         : 12.50
File Name                       : macos-75944430.png
Directory                       : .
File Size                       : 410 kB
File Modification Date/Time     : 2023:01:28 19:56:43+11:00
File Access Date/Time           : 2023:01:28 19:56:45+11:00
File Inode Change Date/Time     : 2023:01:28 19:56:43+11:00
File Permissions                : -rw-r--r--
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 512
Image Height                    : 512
Bit Depth                       : 8
Color Type                      : RGB
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
Parameters                      : watermelon.Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 75944430, Size: 512x512, Model hash: dcd690123c, Model: v2-1_768-ema-pruned
Image Size                      : 512x512
Megapixels                      : 0.262

From the image generated on Linux:

$ exiftool linux-75944430.png
ExifTool Version Number         : 12.50
File Name                       : linux-75944430.png
Directory                       : .
File Size                       : 532 kB
File Modification Date/Time     : 2023:01:28 19:56:20+11:00
File Access Date/Time           : 2023:01:28 19:56:22+11:00
File Inode Change Date/Time     : 2023:01:28 19:56:20+11:00
File Permissions                : -rw-r--r--
File Type                       : PNG
File Type Extension             : png
MIME Type                       : image/png
Image Width                     : 512
Image Height                    : 512
Bit Depth                       : 8
Color Type                      : RGB
Compression                     : Deflate/Inflate
Filter                          : Adaptive
Interlace                       : Noninterlaced
Parameters                      : watermelon.Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 75944430, Size: 512x512, Model hash: dcd690123c, Model: v2-1_768-ema-pruned
Image Size                      : 512x512
Megapixels                      : 0.262

This is just one example. In general, the images never match between these machines.

What should have happened?

The produced images should be identical. There might be slight differences due to machine architecture and floating-point precision, but by eye it looks like the sampler is doing something entirely different.
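One way to back the "entirely different images" claim with a number is to compare the renders pixelwise: floating-point jitter shows up as a tiny maximum difference, while a diverging sampler produces differences spanning most of the 0-255 range. This is a hypothetical helper, not part of webui; loading the real PNGs would additionally assume Pillow.

```python
import numpy as np

def pixel_diff_stats(img_a, img_b):
    """Return (max, mean) absolute per-channel pixel difference.

    Cast to int16 first so that subtracting uint8 values cannot wrap around.
    """
    a = np.asarray(img_a, dtype=np.int16)
    b = np.asarray(img_b, dtype=np.int16)
    diff = np.abs(a - b)
    return int(diff.max()), float(diff.mean())

# With the attached files this would be, e.g.:
#   from PIL import Image
#   pixel_diff_stats(np.array(Image.open("macos-75944430.png")),
#                    np.array(Image.open("linux-75944430.png")))
```

A max difference of a few counts would point at precision noise; a max near 255 with a large mean says the two runs diverged outright.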

Commit where the problem happens

e8a41df49fadd2cf9f23b1f02d75a4947bec5646

What platforms do you use to access the UI ?

Linux, MacOS

What browsers do you use to access the UI ?

Apple Safari

Command Line Arguments

Yes.

MacOS: `export COMMANDLINE_ARGS="--opt-split-attention-invokeai --skip-torch-cuda-test --no-half --use-cpu interrogate"`

Linux: `export COMMANDLINE_ARGS="--listen --enable-insecure-extension-access"`

List of extensions

I updated the extensions to the same versions on both machines; it made no difference.

Active extensions:

  • sd-dynamic-prompts
  • shift-attention
  • stable-diffusion-webui-sonar
  • stable-diffusion-webui-tokenizer

Console logs

Console log from macOS:



################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on myuser user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
==============================================================================
You are running torch 1.12.1.
The program is tested to work with torch 1.13.1.
To reinstall the desired version, run with commandline flag --reinstall-torch.
Beware that this will cause a lot of large files to be downloaded.
==============================================================================
INFO:dynamic_prompting.py:Prompt matrix will create 1 images in a total of 1 batches.
Python 3.10.0 (default, Mar  3 2022, 03:57:21) [Clang 12.0.0 ]
Commit hash: e8a41df49fadd2cf9f23b1f02d75a4947bec5646
Installing requirements for Web UI

Installing sd-dynamic-prompts requirements.txt

Launching Web UI with arguments: --opt-split-attention-invokeai --no-half --use-cpu interrogate
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [dcd690123c] from /Users/myuser/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.safetensors
Creating model from config: /Users/myuser/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0): 
Model loaded in 13.8s (load weights from disk: 0.1s, create model: 0.3s, apply weights to model: 11.7s, move model to device: 1.5s).
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

  0%|          | 0/20 [00:00<?, ?it/s]
  ...
100%|██████████| 20/20 [00:22<00:00,  1.11s/it]

Total progress: 100%|██████████| 20/20 [00:18<00:00,  1.07it/s]/Users/myuser/opt/anaconda3/envs/ml/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Console log from Linux:


################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on myuser user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
INFO:dynamic_prompting.py:Prompt matrix will create 1 images in a total of 1 batches.
Python 3.10.9 (main, Dec 19 2022, 17:35:49) [GCC 12.2.0]
Commit hash: e8a41df49fadd2cf9f23b1f02d75a4947bec5646
Installing requirements for Web UI
Installing sd-dynamic-prompts requirements.txt


Launching Web UI with arguments: --listen --enable-insecure-extension-access
No module 'xformers'. Proceeding without it.
Loading weights [dcd690123c] from /home/myuser/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.safetensors
Creating model from config: /home/myuser/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Applying cross attention optimization (Doggettx).
Textual inversion embeddings loaded(0): 
Model loaded in 2.4s (create model: 0.3s, apply weights to model: 0.7s, apply half(): 0.4s, load VAE: 0.8s, move model to device: 0.3s).
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.

  0%|          | 0/20 [00:00<?, ?it/s]
  ...
100%|██████████| 20/20 [00:03<00:00,  5.38it/s]

Total progress: 100%|██████████| 20/20 [00:02<00:00,  6.83it/s]


### Additional information

I have tried several samplers, and the output always differs between the machines.

Myridium avatar Jan 28 '23 09:01 Myridium

`--no-half` means that operations usually done at half precision run at full precision on the Mac. I think that's the key difference. If you want to reproduce the image, try using `--no-half` on Linux too, though of course that will be much slower.
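To see why precision alone can push two runs apart, here is a minimal, torch-free sketch (plain NumPy, not webui code) of half-precision accumulation drifting away from the full-precision result. A 20-step sampler compounds millions of such roundings, so fp16 on one machine vs. fp32 on the other cannot be expected to agree bit-for-bit.

```python
import numpy as np

step = np.float16(0.1)        # 0.1 is not exactly representable in fp16
half_total = np.float16(0.0)
for _ in range(10000):
    # Each add rounds to the nearest fp16 value before continuing.
    half_total = np.float16(half_total + step)

full_total = 10000 * 0.1      # full-precision reference, 1000.0

# The fp16 total stalls far below 1000 (near 256 with IEEE round-to-nearest):
# once the running sum is large enough, the gap between adjacent fp16
# values exceeds the increment, and adding 0.1 rounds back to the same value.
print(float(half_total), full_total)
```

The diffusion case is far less dramatic per operation, but the mechanism is the same: every fp16 rounding nudges the latents, and the sampler amplifies those nudges step by step.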

Moreover, the cross-attention optimization differs between the Mac (InvokeAI) and the usual one (Doggettx), which adds more inconsistency. Lastly, though I'm not certain of this, MPS and CUDA may also process some parts differently.

All in all, we can't do anything about it.

mezotaken avatar Jan 28 '23 09:01 mezotaken

I see the exact same problem using a Mac Studio vs. Google Colab. Same version of automatic1111 in both, same model, same parameters. Different results. I would also argue that the results on the Mac are usually not as good. Have you noticed any quality difference between the two platforms?

janigro avatar Feb 01 '23 19:02 janigro

I see the exact same problem using a Mac Studio vs. Google Colab. Same version of automatic1111 in both, same model, same parameters. Different results. I would also argue that the results on the Mac are usually not as good. Have you noticed any quality difference between the two platforms?

I don't know about a quality difference, but the images are totally different. It's not just minor alterations; they're entirely different images. And there's definitely some consistent difference between how images look on either platform. It's not as if it's using different seeds; it's more that the sampler behaves totally differently. I don't know whether one is better or not.

Myridium avatar Feb 02 '23 01:02 Myridium

Well, this is definitely not an Automatic1111 issue. I just tried InvokeAI with the same configuration.

Both colab images (1111 and invoke) look very similar, and both mac images (1111 and invoke) look similar too. But the images between colab and mac are very different.

  • Prompt: a red sport car, winding road, golden hour
  • Model: v1-5-pruned-emaonly.ckpt
  • Seed: 2023
  • Sampler: Euler a
  • Steps: 20
  • CFG scale: 7

diff-demo

Here's a GIF with all the results. I launched automatic1111 with `--no-half` on both platforms.

Finally, what I said about quality was just a subjective impression; I would need a larger set of results for a proper judgment.

janigro avatar Feb 02 '23 09:02 janigro

I ran into the same problem.

It might be due to the randn() function used to generate the random numbers: https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/master/modules/devices.py#L94. If the device is mps, it generates the same numbers as cpu, but when the device is cuda, it generates different numbers.

This patch always generates the same numbers as the cpu, regardless of device:

def randn(seed, shape):
    torch.manual_seed(seed)
    # Original device-dependent logic, commented out:
    # from modules.shared import opts
    # if opts.randn_source == "CPU" or device.type == 'mps':
    #     return torch.randn(shape, device=cpu).to(device)
    # return torch.randn(shape, device=device)
    # Always sample on the CPU generator, then move to the compute device:
    return torch.randn(shape, device=cpu).to(device)

After this patch, my M1 Mac and a Google Colab notebook generated exactly the same images when the prompt and seed were the same. Unfortunately, there seems to be no way for mps to reproduce the image that the default code generates in a CUDA environment. I don't know how this patch affects performance, but it didn't seem very slow when I tried it on Colab.
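The idea behind the patch can be sketched without torch. `randn_like_cpu` below is a hypothetical stand-in for the patched `modules.devices.randn()`: all noise comes from one reference generator (NumPy here, torch's CPU generator in the real patch), and only afterwards would the tensor be moved to mps or cuda, so every platform starts sampling from bit-identical latents.

```python
import numpy as np

def randn_like_cpu(seed, shape):
    """Draw initial latent noise from a single reference RNG.

    Seeding and sampling always happen on the same generator, so the
    values do not depend on which accelerator later consumes them.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape, dtype=np.float32)

# The same seed yields the same noise on every platform and every run.
noise_run_1 = randn_like_cpu(75944430, (4, 64, 64))
noise_run_2 = randn_like_cpu(75944430, (4, 64, 64))
assert (noise_run_1 == noise_run_2).all()
```

This is also why mps can match cpu but not cuda: cuda's default path draws from its own on-device generator, which is a different stream even for the same seed.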

EbaraKoji avatar Jun 15 '23 12:06 EbaraKoji