AssertionError: You do not have CLIP state dict! and blue screen if using both text_encoder + VAE, also blurry images

Open Undpanzer opened this issue 8 months ago • 1 comments

I was trying to create an image with a certain checkpoint (it's my first time doing this, so I'm not sure if I can link it here since it's NSFW) and I ended up getting this error.

I asked the guy from the YouTube video I followed what to do, and he sent me a GitHub link that told me to download both of these:

https://huggingface.co/lllyasviel/flux_text_encoders/tree/main
https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main

DeepSeek told me FP16 would be better in my case since I have a weak 6GB VRAM card (1660 Super) and 16GB of RAM.

After installing both the text encoder and the VAE, I first tried generating with just the encoder, but it didn't work. Then, when I tried running both as someone on that forum suggested, my computer immediately blue-screened with an out-of-memory error.

Also worth mentioning: I've barely been able to generate anything with ForgeUI. It always creates something blurry or kind of hazy. The sole exception was when I tried doing something with anime, and that worked normally.

These are my checkpoints. The first one produced a total blur when I tried using it, the middle one is the one I'm currently having problems with, and the last one worked for anime, but when I asked it for a simple cat the result was all hazy. That was a few days ago, so I don't remember the settings I used for it.

Suffice to say I'm a complete newbie to this stuff. I also used ComfyUI and it was alright, but it was taking way too long to generate images and the UI was kind of complicated, so I decided to go back to ForgeUI.

Mar 30 '25 18:03 Undpanzer

You don't have enough VRAM/system RAM to run the FP16 Flux Dev model efficiently. My suggestion is to get Flux Schnell here and download any of the Q models. Start with Q8_0 and see how it goes. Schnell is a distilled version of Flux Dev that is meant to be faster. The quality is not as good as Dev, but in your case, speed will be more important.
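For a rough sense of why FP16 is out of reach on your hardware, here is a back-of-the-envelope sketch in Python. It only counts the main model's weights and assumes roughly 12 billion parameters for Flux (the published figure); the function names are made up for illustration, and real usage will be higher once the text encoders, VAE, and activations are loaded.

```python
# Back-of-the-envelope estimate of weight storage only.
# Assumption: Flux has ~12e9 parameters; text encoders, VAE, and
# activations are NOT included, so real memory use is higher.
GIB = 1024 ** 3
FLUX_PARAMS = 12e9

# FP16 stores 16 bits per weight. GGUF Q8_0 stores blocks of 32 int8
# weights plus one 16-bit scale: (32 * 8 + 16) / 32 = 8.5 bits per weight.
BITS_PER_WEIGHT = {"FP16": 16.0, "Q8_0": 8.5}


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size in GiB of n_params weights at the given bit width."""
    return n_params * bits_per_weight / 8 / GIB


def estimate_sizes(n_params: float = FLUX_PARAMS) -> dict:
    """Return {format name: approximate weight size in GiB}."""
    return {name: weights_gib(n_params, bits)
            for name, bits in BITS_PER_WEIGHT.items()}


if __name__ == "__main__":
    vram_gib, ram_gib = 6.0, 16.0  # the hardware described in this thread
    for name, size in estimate_sizes().items():
        if size <= vram_gib:
            verdict = "fits in VRAM"
        elif size <= vram_gib + ram_gib:
            verdict = "needs offloading to system RAM"
        else:
            verdict = "exceeds VRAM + system RAM combined"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

By this estimate the FP16 weights alone are larger than your VRAM and system RAM put together, which would be consistent with the out-of-memory crash, while Q8_0 is roughly half that size and can at least be split between the GPU and system RAM.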

You can grab the GGUF version of the T5 encoder here. Same story: start with Q8_0 and see if it's fast enough.

You'll use the same CLIP L and Flux VAE files from the links you had before.

Now, make sure you've selected Flux at the top left of the UI. Ensure you've chosen the correct Flux model, and that the CLIP, T5 XXL, and VAE models are selected in the text encoder list. Then adjust the GPU Weights slider to ~5GB for your 6GB of VRAM.

This discussion here has more information on selecting flux models and other tweaks you can make: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050

Apr 05 '25 05:04 MisterChief95