Interesting idea worth implementing: use CLIP guidance to enhance the quality and coherency of images

Open Centurion-Rome opened this issue 3 years ago • 4 comments

Is your feature request related to a problem? Please describe. Sometimes larger images are not coherent.

Describe the solution you'd like See the idea behind this post: https://www.reddit.com/r/StableDiffusion/comments/y4fekg/dreamstudio_will_now_use_clip_guidance_to_enhance/

Centurion-Rome avatar Oct 15 '22 14:10 Centurion-Rome

It's not about larger images. You can fix those by enabling the high-resolution checkbox.

CLIP guidance is a slower process that runs CLIP at every sampling step, and it is more about helping the model follow the prompt in detail than about maintaining coherence at large sizes. It gets Stable Diffusion a lot closer to DALL-E 2 in terms of correctly understanding prompts (still not perfect, but better).
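Conceptually, the guidance step looks something like the sketch below. This is only a rough illustration with open_clip, not the DreamStudio implementation; the function name and scale value are made up, and it glosses over the resizing and normalization CLIP expects.

```python
import torch
import open_clip

# Load an OpenCLIP model of the kind typically used for guidance.
clip_model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

def clip_guidance_grad(denoised_image, prompt, guidance_scale=200.0):
    """Gradient of the CLIP image/text similarity w.r.t. the current denoised estimate.
    Adding this gradient at every sampling step nudges the image toward the prompt."""
    text_features = clip_model.encode_text(tokenizer([prompt]))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

    with torch.enable_grad():
        image = denoised_image.detach().requires_grad_(True)
        # Assumes the image has already been resized/normalized for CLIP.
        image_features = clip_model.encode_image(image)
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        similarity = (image_features * text_features).sum()
        (grad,) = torch.autograd.grad(similarity * guidance_scale, image)
    return grad
```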

lendrick avatar Oct 15 '22 16:10 lendrick

Someone has apparently already implemented it:

https://github.com/Birch-san/stable-diffusion/compare/34556bc45211e0a1d3554ee0cd6795793889fbfb...1f13594371a31c417c8ce5f296fead63e43be337

https://twitter.com/Birchlabs/status/1578141960249876482

kybercore avatar Oct 15 '22 21:10 kybercore

So how do you integrate this with the interface code? The filenames don't quite line up, so I cannot simply copy and paste the code changes mentioned.

ASilver avatar Oct 24 '22 14:10 ASilver

Was CLIP Guidance ever implemented into Automatic1111?

slymeasy avatar Nov 13 '22 15:11 slymeasy

Worth noting that implementing native CLIP guidance would allow for dramatic improvements to outpainting; see https://www.reddit.com/r/StableDiffusion/comments/ysv5lk/outpainting_mk3_demo_gallery/

aaronsantiago avatar Nov 15 '22 16:11 aaronsantiago

Any new developments on this by chance?

tchesket avatar Dec 16 '22 22:12 tchesket

Hi, I implemented the code from Birch-san's repository into webui. I don't know anything about the underlying math, but it seems to work okay.

https://github.com/space-nuko/stable-diffusion-webui/tree/feature/clip-guidance

Note that it is very slow even for a single image, and some people recommend >50 steps for best results. Also note that this implementation only works when batch_size=1.

Also, for the record, I think this would be very difficult to turn into an extension, since it requires modifications to how the Stable Diffusion samplers work.
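To give a rough idea of why: the guidance has to wrap the denoiser that the samplers call at every step and shift its output by the CLIP gradient. A simplified sketch of that wrapper is below (not the exact code in the branch; CLIPGuidedDenoiser and cond_fn are illustrative names, and sigma is the per-sample noise level tensor used by the k-diffusion samplers).

```python
import torch

class CLIPGuidedDenoiser(torch.nn.Module):
    """Wraps the denoiser the sampler calls at each step and shifts its output by a
    CLIP gradient. The real code also has to decode latents to pixels for CLIP,
    which is where most of the extra time and VRAM goes."""

    def __init__(self, inner_model, cond_fn):
        super().__init__()
        self.inner_model = inner_model
        self.cond_fn = cond_fn  # returns the CLIP-similarity gradient for the current step

    def forward(self, x, sigma, **kwargs):
        denoised = self.inner_model(x, sigma, **kwargs)
        grad = self.cond_fn(x, sigma, denoised, **kwargs)
        # Nudge the denoised estimate toward higher CLIP similarity.
        return denoised + grad * (sigma ** 2).reshape(-1, 1, 1, 1)
```

Because this sits inside the sampling loop rather than behind one of webui's script callbacks, it is hard to bolt on from the outside.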

Some examples I made with the Euler a sampler, 50 steps:

[Example images omitted; all share seed 588878215 and the prompt "photograph, realistic, hyper detail, masterpiece, best quality, painting, rendered in blender, 4k, ((tracing)), (subsurface scat…"]

- No CLIP guidance
- ViT-B-16-plus-240, pretrained=laion400m_e32, CLIP guidance scale=200
- roberta-ViT-B-32, pretrained=laion2b_s12b_b32k, CLIP guidance scale=250
- ViT-B-32, pretrained=laion2b_s34b_b79k, CLIP guidance scale=200
- ViT-B-32, pretrained=laion2b_s34b_b79k, CLIP guidance scale=300
- ViT-B-32, pretrained=laion2b_s34b_b79k, CLIP guidance scale=400

space-nuko avatar Dec 20 '22 23:12 space-nuko

I installed it, but I am getting constant CUDA out-of-memory errors. I reduced the CLIP guidance scale to 50, just in case, but it made no difference. Example:

RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 8.00 GiB total capacity; 7.22 GiB already allocated; 0 bytes free; 7.28 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This is using an RTX 2080 Super with 8GB VRAM.
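(For reference, the max_split_size_mb suggestion from the error message is passed through PyTorch's PYTORCH_CUDA_ALLOC_CONF environment variable before anything touches CUDA; 128 below is just an example value.)

```python
import os

# Must be set before the CUDA allocator is initialized, i.e. before any CUDA tensor exists.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # importing torch after setting the variable keeps the order safe
```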

ASilver avatar Dec 21 '22 18:12 ASilver

Yeah, I think the VRAM requirements are just really high; I don't remember it taking less than 16GB for me with xformers enabled.

Part of the reason is that I had to turn off gradient checkpointing for it to work; that's a feature that saves VRAM but apparently can't be used with some torch features (torch.autograd.grad() in this case). I don't know if it just has to be implemented like that, or if there's another way that an actual ML whiz could figure out.
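The problematic pattern is roughly the sketch below, where decode_fn and clip_score_fn stand in for the VAE decode and the CLIP scoring: gradients have to flow back through both, so all of their activations stay alive, and with checkpointing off that is the full graph.

```python
import torch

def clip_loss_grad(latents, decode_fn, clip_score_fn):
    """Sketch of the VRAM-hungry part: backprop through the VAE decode and CLIP encoder."""
    with torch.enable_grad():
        latents = latents.detach().requires_grad_(True)
        image = decode_fn(latents)        # differentiable VAE decode to pixel space
        loss = -clip_score_fn(image)      # negative CLIP similarity to the prompt
        (grad,) = torch.autograd.grad(loss, latents)
    return -grad
```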

space-nuko avatar Dec 22 '22 02:12 space-nuko

Is there any way to run this with 8GB of VRAM?

Nyaster avatar Dec 23 '22 04:12 Nyaster

Is this available as an extension or is it a full fork?

azureprophet avatar Dec 24 '22 05:12 azureprophet

It's a fork for now. I had to make some changes to the original code to get it to work correctly, and I'm still trying to figure out how to improve the performance.

space-nuko avatar Dec 30 '22 18:12 space-nuko

As soon as you have a possible way to make this work with 8GB of VRAM, drop a note here and I will gladly help test.

ASilver avatar Dec 30 '22 19:12 ASilver

I hope this can be installed as an extension so we don't have to use a fork, or be implemented directly in A1111.

andupotorac avatar Jun 07 '23 12:06 andupotorac