stable-diffusion-webui
[Feature Request]: Support for new 2.0 models | 768x768 resolution + new 512x512 + depth + inpainting
Is there an existing issue for this?
- [X] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
Support the new 768x768 2.0 model from Stability-AI and all the other new models that were just released.
Proposed workflow
- Go to the Stable Diffusion checkpoint selector
- Select `768-v-ema.ckpt` from the list
- Create images just like with any other model
- Extra "nice to have" option: set the resolution to 768x768 automatically when loading this model
- Add support for the new 512x512 models: base + inpainting + depth
- Add support for the new x4 upscaler model
Links
- https://huggingface.co/stabilityai/stable-diffusion-2
- https://huggingface.co/stabilityai/stable-diffusion-2-base
- https://huggingface.co/stabilityai/stable-diffusion-2-depth
- https://huggingface.co/stabilityai/stable-diffusion-2-inpainting
- https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/tree/main

Direct checkpoint downloads on HuggingFace:
- 768 model: https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.ckpt
- 512 base model: https://huggingface.co/stabilityai/stable-diffusion-2-base/blob/main/512-base-ema.ckpt
- 512 depth model: https://huggingface.co/stabilityai/stable-diffusion-2-depth/blob/main/512-depth-ema.ckpt
- 512 inpainting model: https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/blob/main/512-inpainting-ema.ckpt
- x4 upscaler: https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/blob/main/x4-upscaler-ema.ckpt
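If it helps anyone scripting this, the same checkpoints can also be fetched programmatically. A minimal convenience sketch using huggingface_hub (optional on my part, not something the webui itself needs):

```python
# Hedged convenience sketch: download the SD 2.0 checkpoints with huggingface_hub.
# Repo IDs and filenames match the links above; files land in the local Hugging Face cache.
from huggingface_hub import hf_hub_download

ckpt_768 = hf_hub_download("stabilityai/stable-diffusion-2", "768-v-ema.ckpt")
ckpt_512_base = hf_hub_download("stabilityai/stable-diffusion-2-base", "512-base-ema.ckpt")
print(ckpt_768, ckpt_512_base)  # local paths; copy or symlink them into models/Stable-diffusion
```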
Additional information
Here is the error message you get when trying to load the 768x768 2.0 model with the current release:
```
Traceback (most recent call last):
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\gradio\routes.py", line 284, in run_predict
output = await app.blocks.process_api(
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 982, in process_api
result = await self.call_function(fn_index, inputs, iterator)
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\gradio\blocks.py", line 824, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\stable-diffusion-webui-master\modules\ui.py", line 1662, in <lambda>
fn=lambda value, k=k: run_settings_single(value, key=k),
File "C:\stable-diffusion-webui-master\modules\ui.py", line 1504, in run_settings_single
opts.data_labels[key].onchange()
File "C:\stable-diffusion-webui-master\webui.py", line 41, in f
res = func(*args, **kwargs)
File "C:\stable-diffusion-webui-master\webui.py", line 83, in <lambda>
shared.opts.onchange("sd_model_checkpoint", wrap_queued_call(lambda: modules.sd_models.reload_model_weights()))
File "C:\stable-diffusion-webui-master\modules\sd_models.py", line 291, in reload_model_weights
load_model_weights(sd_model, checkpoint_info)
File "C:\stable-diffusion-webui-master\modules\sd_models.py", line 182, in load_model_weights
model.load_state_dict(sd, strict=False)
File "C:\stable-diffusion-webui-master\venv\lib\site-packages\torch\nn\modules\module.py", line 1604, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LatentDiffusion:
size mismatch for model.diffusion_model.input_blocks.1.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.input_blocks.1.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.2.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.input_blocks.2.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.input_blocks.2.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.input_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.input_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.input_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.input_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.input_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.input_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.input_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.input_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.middle_block.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.middle_block.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.middle_block.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.3.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.3.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.3.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.4.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.4.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.4.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.5.1.proj_in.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.5.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([1280, 1024]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
size mismatch for model.diffusion_model.output_blocks.5.1.proj_out.weight: copying a param with shape torch.Size([1280, 1280]) from checkpoint, the shape in current model is torch.Size([1280, 1280, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.6.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.6.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.7.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.7.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.7.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.8.1.proj_in.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.8.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([640, 1024]) from checkpoint, the shape in current model is torch.Size([640, 768]).
size mismatch for model.diffusion_model.output_blocks.8.1.proj_out.weight: copying a param with shape torch.Size([640, 640]) from checkpoint, the shape in current model is torch.Size([640, 640, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.9.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.9.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.9.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.10.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.10.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.10.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.11.1.proj_in.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_k.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.11.1.transformer_blocks.0.attn2.to_v.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([320, 768]).
size mismatch for model.diffusion_model.output_blocks.11.1.proj_out.weight: copying a param with shape torch.Size([320, 320]) from checkpoint, the shape in current model is torch.Size([320, 320, 1, 1]).
```
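For anyone poking at this: the mismatches above are enough to tell a 2.0 checkpoint apart from a 1.x one before trying to load it. A small illustrative sketch (not webui code) that checks the cross-attention key width, which is 1024 for the new OpenCLIP text encoder and 768 for the v1 CLIP encoder:

```python
# Illustrative only: inspect a checkpoint's state dict to guess whether it needs a v2 config.
# The key is the same one that appears in the size-mismatch errors above.
import torch

def guess_sd_version(ckpt_path: str) -> str:
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)
    key = "model.diffusion_model.input_blocks.1.1.transformer_blocks.0.attn2.to_k.weight"
    context_dim = state_dict[key].shape[1]
    return "2.x (load with a v2-inference*.yaml config)" if context_dim == 1024 else "1.x"
```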
To help those who are ready to take this on: https://twitter.com/RiversHaveWings/status/1595596524431773697 and https://github.com/crowsonkb/k-diffusion/commit/4314f9101a2f3bd7f11ba4290d2a7e2e64b4ceea. As far as I understand, we only need to use this wrapper when working with 2.0 models.
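Rough idea of how that wrapper would slot in, assuming the class added by the linked commit keeps the same interface as the existing CompVisDenoiser (the class name and call shape are assumptions on my part, not a tested patch):

```python
# Hedged sketch: pick the right k-diffusion wrapper depending on whether the
# checkpoint is a v-prediction (2.0 768) model or a regular eps-prediction model.
import k_diffusion as K

def wrap_for_k_diffusion(sd_model, is_v_prediction: bool):
    if is_v_prediction:
        # added in crowsonkb/k-diffusion@4314f91 for CompVis-style v-objective models
        return K.external.CompVisVDenoiser(sd_model)
    # unchanged path used today for 1.x checkpoints
    return K.external.CompVisDenoiser(sd_model)
```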
Semi-Related:
- https://github.com/Stability-AI/stablediffusion/issues/4
- https://github.com/Stability-AI/stablediffusion/issues/9
- https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5009 [dupe]
- https://github.com/huggingface/diffusers/issues/1388
- https://github.com/huggingface/diffusers/issues/1392
- https://github.com/db0/AI-Horde/issues/86
- https://github.com/JoePenna/Dreambooth-Stable-Diffusion/issues/112
- https://github.com/TheLastBen/fast-stable-diffusion/issues/599
- https://github.com/ShivamShrirao/diffusers/issues/143
- https://github.com/Sygil-Dev/nataili/issues/67
- https://github.com/Sygil-Dev/sygil-webui/issues/1686
Trying to use the wrapper here, but I realized the model loader is not even getting that far: the model is still being built with the v1 layer shapes (768-wide cross-attention, 4-dimensional conv projection weights), which don't match the new 2.0 checkpoint.
https://github.com/MrCheeze/stable-diffusion-webui/commit/069591b06bbbdb21624d489f3723b5f19468888d

Anyone on Linux (and likely Mac) that just wants to try it, a few things I found:
I highly recommend cloning v2 to a new folder for the moment if you just want to try it!
- Most likely anything like old models, hypernetworks, VAEs, probably embeddings, and other scripts will be broken until better support is done.
- I have no idea what the state of model merging, training, upscaling, etc. is like, but many features are likely broken.
- I was unable to load MrCheeze's repo with old models in models/Stable-diffusion.
- I think memory optimizations are also broken right now.
- The prompting language has changed, so you'll need to learn how to craft prompts slightly differently again.
- The old decoders seem to work fine, and things like prompt weighting work as well.
- I get an error about AttributeError: 'FrozenOpenCLIPEmbedder' object has no attribute 'process_text', but it seems to be working anyway; I'm not sure exactly what that's about. EDIT: This appears to be related to getting the token count for the GUI, and I don't think it affects generation.
- On initial tests it seems to do a lot better at 768x768 than it does at 512x512, but this isn't well tested.
```bash
git clone https://github.com/MrCheeze/stable-diffusion-webui.git stable-diffusion-v2
cd stable-diffusion-v2
git checkout sd-2.0  # I tested commit 069591b06bbbdb21624d489f3723b5f19468888d specifically
```
After setting up a venv, installing the requirements.txt, and placing the model into models/Stable-diffusion, I was able to launch with the following command:
```bash
STABLE_DIFFUSION_REPO=https://github.com/Stability-AI/stablediffusion \
STABLE_DIFFUSION_COMMIT_HASH=33910c386eaba78b7247ce84f313de0f2c314f61 \
python launch.py --config repositories/stable-diffusion/configs/stable-diffusion/v2-inference-v.yaml
```
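For context on why those variables work: launch.py resolves the repo and commit it clones from the environment, roughly like the sketch below (paraphrased from memory, not the exact source; the v1 defaults shown are placeholders):

```python
# Paraphrased mechanism only: launch.py falls back to the v1 CompVis repo unless
# these environment variables are set, which is what the command above overrides.
import os

stable_diffusion_repo = os.environ.get(
    "STABLE_DIFFUSION_REPO", "https://github.com/CompVis/stable-diffusion.git"  # v1 default
)
stable_diffusion_commit_hash = os.environ.get(
    "STABLE_DIFFUSION_COMMIT_HASH", "<pinned v1 commit>"  # placeholder
)
```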

> I get an error about AttributeError: 'FrozenOpenCLIPEmbedder' object has no attribute 'process_text' but it seems to be working anyway, I'm not sure exactly what that's about.
This causes my instance to stop working. How did you get it to proceed?
Edit: Resolved. Remove VRAM constraints
I confirm it works over here as well! I did not have to use the special launch command, though (STABLE_DIFFUSION_REPO=https://github.com/Stability-AI/stablediffusion STABLE_DIFFUSION_COMMIT_HASH=33910c386eaba78b7247ce84f313de0f2c314f61 python launch.py --config repositories/stable-diffusion/configs/stable-diffusion/v2-inference-v.yaml); I simply used the webui-user.bat launcher and it worked the first time, after installing all dependencies automatically.
I have got it working on Google Colab. As @Penagwin mentioned, it throws a few errors but still functions.
Note: Tick checkbox for SD1_5 rather than adding it in the Add models section
The way it processes the text seems to be broken: AttributeError: 'FrozenOpenCLIPEmbedder' object has no attribute 'process_text'. My generations with the new models look ugly as hell.
@AugmentedRealityCat The command is only needed for Linux and macOS; the .bat should work for Windows.
@acheong08 Someone else should confirm this, but I believe this error comes from getting the token count to display in the UI, which is why it's not actually required for generation. If that's right, I don't think it affects the generated image. This is the line that calls the method that errors, and it's inside update_token_counter:
https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/828438b4a190759807f9054932cae3a8b880ddf1/modules/ui.py#L443
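If that's the case, a guard along these lines would probably be enough to silence it until proper 2.0 support lands; this is a hypothetical sketch, not the repo's actual fix, and the process_text call shape is assumed from the error message:

```python
# Hypothetical guard for the token counter: FrozenOpenCLIPEmbedder (SD 2.0) has no
# process_text, so skip the count instead of raising inside update_token_counter.
def safe_token_count(cond_stage_model, text, fallback=(0, 75)):
    if not hasattr(cond_stage_model, "process_text"):
        return fallback  # keep the UI responsive; generation itself doesn't use this path
    return cond_stage_model.process_text(text)  # assumed v1 call shape
```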
The first several prompts I tried were very... odd. I found 768x768 resolution made a huge difference. I also found starting a prompt from scratch might be a good idea too, just to learn the new prompting language.
I don't know for certain that it's not broken, but I was able to get a few images that I liked.
I've had success with the new DPM++ SDE Karras, as well as Euler A. I am finding it a bit more difficult to get good images but I'm unsure if that's because of the v2 changes or if something is broken, or if my prompts are just bad.
Some that I liked:

masterpiece, detailed, dreaming of electric (penguins :4), scifi, concept art, (surreal), galaxy background, sharp,[fractals :1.8], [recursion :1.8] Negative prompt: blurry Steps: 5, Sampler: DPM++ SDE Karras, CFG scale: 8, Seed: 3034038969, Size: 768x768, Model hash: 2c02b20a, Eta: 0.06

masterpiece, extremely detailed, dreaming of (electric) (penguins :4), scifi, concept art, (surreal), moon, galaxy background, sharp,[fractals :1.8], [recursion :1.8] Negative prompt: blurry Steps: 5, Sampler: DPM++ SDE Karras, CFG scale: 9, Seed: 670988386, Size: 768x768, Model hash: 2c02b20a, Eta: 0.06

masterpiece, extremely detailed, dreaming of (electric) (penguins :2), scifi, digital concept art, (surreal), moon, galaxy background, supernova, dramatic, sharp,[fractals :1.4], [recursion :1.8] Negative prompt: blurry, painting, drawing Steps: 15, Sampler: DPM++ SDE Karras, CFG scale: 13.5, Seed: 4235446037, Size: 768x768, Model hash: 2c02b20a, Eta: 0.06
On it!
Originally posted by @TheLastBen in https://github.com/TheLastBen/fast-stable-diffusion/issues/599#issuecomment-1326063269
@Penagwin It seems it was bad prompting. Their new prompt system messed it up for me. Trying a few times gets me much better results
NSFW has been completely wrecked. It was bad on 1.5 but now it's almost impossible to get anything aesthetic. It feels like Midjourney.
They succeeded at their goal.
NO NSFW??

> NO NSFW??
All attempts seem to make it black and white with severely deformed limbs. The samples that make it past their filters seem to be low-quality images and abstract art.


> NO NSFW??
> All attempts seem to make it black and white with severely deformed limbs. The samples that make it past their filters seem to be low-quality images and abstract art.
Try 768x768, since that's what the model's trained for. Doing what you're doing is like telling 1.5 to work at 256x256, it ain't good at resolutions lower than what it was meant for.

At 768x768
Giving up on NSFW
@acheong08 Prompt? Kinda looks like what I got using "woman spread naked, on the beach, fullbody, nude, top-down" in 1.5.
The stock model was never too good at NSFW anyway; too many gross, mangled people. It did get more normal-looking results after the first attempt with this prompt, though I can't tell if posting NSFW here is against GitHub TOS, so I'll refrain from posting those.
> posting NSFW here is against GitHub TOS
It's not porn, it's not even erotica, and not even naturalistic content at all. This is body horror.
> posting NSFW here is against GitHub TOS
> It's not porn, it's not even erotica, and not even naturalistic content at all. This is body horror.
I'll refrain from posting any more here. All my attempts with 2.0 have been horrific.
> Prompt?
I just copy-pasted 1.5 prompts that got me good results previously. I'll ask around on Discord; GitHub is not meant for such discussions.
Do you want to hide your ugly

Took some fighting to get the 2.0 model to work within the free tier of Colab (it kept ^Cing on me), but after restarting, it had just enough RAM free to run the GUI. Running the prompt again, with the same sampler and step count (Steps: 50, Sampler: DPM++ 2M Karras), but with the addition of some naughty bits (boobs, breasts, vagina, please GitHub don't kill me), I did get the usual stock results, somewhere about on par with what 1.5 was capable of.
boobs and stuff

To say the results are good would be a complete lie, but again, they are about what you'd expect from the stock 1.5 model. What's a bit weird is that there seem to be some strange alignment issues. When it isn't mangled, it's off center.
> Kept ^Cing on me
@Daviljoe193
How did you solve this? I'm also getting ^C with a paid plan...
> Kept ^Cing on me
> How did you solve this? I'm also getting ^C with a paid plan...
Restart the session after running everything just up to the ^Cing cell, then re-run that cell again, changing nothing. It's stupid, but that's how it is.
Thanks it worked!
> NSFW has been completely wrecked. It was bad on 1.5 but now it's almost impossible to get anything aesthetic.
According to the model card they HEAVILY filtered the training data before training the model (threshold of 0.1, where 1.0 is considered fully NSFW), so it's not just a filter tacked on at the end like last time.
- https://github.com/Stability-AI/stablediffusion/blob/main/modelcard.md
Training Data: The model developers used the following dataset for training the model:
- LAION-5B and subsets (details below). The training data is further filtered using LAION's NSFW detector, with a "p_unsafe" score of 0.1 (conservative). For more details, please refer to LAION-5B's NeurIPS 2022 paper and reviewer discussions on the topic.
We currently provide the following checkpoints:
- 512-base-ema.ckpt: 550k steps at resolution 256x256 on a subset of LAION-5B filtered for explicit pornographic material, using the LAION-NSFW classifier with punsafe=0.1 and an aesthetic score >= 4.5. 850k steps at resolution 512x512 on the same dataset with resolution >= 512x512.
That said, I would assume that that would also mean that anyone who gathered a sufficient training dataset could probably finetune/dreambooth the concept back into the model.
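For anyone skimming, the filtering the model card describes boils down to a per-sample threshold; a toy illustration with made-up field names (LAION stores these scores as per-pair metadata, so treat the keys as placeholders):

```python
# Toy illustration of the dataset filter described above; field names are invented.
def keep_for_training(sample: dict) -> bool:
    return sample["punsafe"] < 0.1 and sample["aesthetic_score"] >= 4.5
```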
> they are about what you'd expect from the stock 1.5 model.
I get much better results from 1.5 on average (for NSFW); no deformations with the correct negative prompts.
> That said, I would assume that that would also mean that anyone who gathered a sufficient training dataset could probably finetune/dreambooth the concept back into the model.
The dataset is already publicly available. The issue is computational power.
this GH issue is like a chat right now lol
> this GH issue is like a chat right now lol
Discord for devs