stable-diffusion.cpp

Always getting black images with XL models

Open FrankEscobar opened this issue 1 year ago • 2 comments

I've tried with sd_xl_base_1.0.safetensors and sd_xl_turbo_1.0_fp16.safetensors, using CPU and GPU, trying f32, f16...

But I always get black images, even with a command line as simple as ./sd.exe --model sd_xl_turbo_1.0_fp16.safetensors --prompt "a dog" --width 512 --height 512 --steps 1 --output output0.png

I have no issues with other models like 1.5 or 2.1, thank you!

FrankEscobar, Feb 01 '24 17:02

Download this VAE, sdxl-vae-fix, and add the argument --vae sdxl_vae.safetensors
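As a minimal sketch of that suggestion, the snippet below assembles the original command with the extra --vae argument. Only flags that already appear in this thread are used; the helper name build_sd_command and its defaults are hypothetical, and the VAE file is assumed to sit next to the executable.

```python
import shlex


def build_sd_command(model, vae, prompt, output,
                     width=512, height=512, steps=1, exe="./sd.exe"):
    """Assemble the sd argument list, passing a separate VAE file.

    Hypothetical helper: it only arranges flags quoted in this thread.
    """
    return [
        exe,
        "--model", model,
        "--vae", vae,
        "--prompt", prompt,
        "--width", str(width),
        "--height", str(height),
        "--steps", str(steps),
        "--output", output,
    ]


cmd = build_sd_command(
    model="sd_xl_turbo_1.0_fp16.safetensors",
    vae="sdxl_vae.safetensors",
    prompt="a dog",
    output="output0.png",
)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))

# To run it directly instead: subprocess.run(cmd, check=True)
```

Building the command as a list keeps the prompt as a single argument even when it contains spaces.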

FSSRepo, Feb 01 '24 19:02

Great, thank you!

To be honest, I'm a bit lost about VAEs, emaonly, noema, and so on, and when to use one or the other.

Searching Google for this mostly turns up confusing discussions on Reddit.

By the way, I expected a quality gain from XL, but the results are pretty unrealistic. I also tried a LoRA, but it didn't work for me (LoRAs do work with 1.5/2.1).

These are the warnings shown:

STD OUT
[INFO ] stable-diffusion.cpp:141  - loading model from 'sd_xl_base_1.0.safetensors'
[INFO ] model.cpp:645  - load sd_xl_base_1.0.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:152  - loading vae from 'sdxl_vae.safetensors'
[INFO ] model.cpp:645  - load sdxl_vae.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:169  - Stable Diffusion XL 
[INFO ] stable-diffusion.cpp:175  - Stable Diffusion weight type: f16
[INFO ] stable-diffusion.cpp:276  - total memory buffer size = 6570.67MB (clip 1568.77MB, unet 4903.43MB, vae 98.47MB)
[INFO ] stable-diffusion.cpp:278  - loading model from 'sd_xl_base_1.0.safetensors' completed, taking 6.21s
[INFO ] stable-diffusion.cpp:292  - running in eps-prediction mode
[INFO ] model.cpp:645  - load Dressed animals XL.safetensors using safetensors format
[INFO ] lora.hpp:35   - loading LoRA from 'Dressed animals XL.safetensors'
[INFO ] stable-diffusion.cpp:405  - lora 'Dressed animals XL' applied, taking 1.15s
[INFO ] stable-diffusion.cpp:1233 - apply_loras completed, taking 1.15s
[INFO ] stable-diffusion.cpp:1272 - get_learned_condition completed, taking 389 ms
[INFO ] stable-diffusion.cpp:1288 - sampling using modified DPM++ (2M) method
[INFO ] stable-diffusion.cpp:1292 - generating image: 1/1 - seed 870920193
  |==================================================| 50/50 - 1.93it/s
[INFO ] stable-diffusion.cpp:1304 - sampling completed, taking 26.97s
[INFO ] stable-diffusion.cpp:1312 - generating 1 latent images completed, taking 27.03s
[INFO ] stable-diffusion.cpp:1314 - decoding 1 latents
[INFO ] stable-diffusion.cpp:1324 - latent 1 decoded, taking 0.68s
[INFO ] stable-diffusion.cpp:1328 - decode_first_stage completed, taking 0.68s
[INFO ] stable-diffusion.cpp:1347 - txt2img completed in 28.10s

STD ERROR
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4070, compute capability 8.9, VMM: yes
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_down.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_up.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.lora_down.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc2.lora_up.weight
....
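Since the warning list is truncated above, a rough way to see which parts of the LoRA were skipped is to count the "unused lora tensor" lines by the first component of the tensor name. The sketch below assumes the log text has been captured into a string; summarize_unused is a hypothetical helper, and reading te1 as the first text encoder is an inference from the tensor names, not something the log states.

```python
import re
from collections import Counter

# Matches lines such as:
# [WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.alpha
UNUSED_RE = re.compile(r"unused lora tensor (\S+)")


def summarize_unused(log_text):
    """Count unused LoRA tensors per leading name component (e.g. 'te1')."""
    counts = Counter()
    for name in UNUSED_RE.findall(log_text):
        # Drop the leading "lora." and keep the text before the first underscore.
        stem = name[len("lora."):] if name.startswith("lora.") else name
        counts[stem.split("_", 1)[0]] += 1
    return counts


# Sample lines copied from the log in this thread.
sample_log = """\
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.alpha
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_down.weight
[WARN ] lora.hpp:154  - unused lora tensor lora.te1_text_model_encoder_layers_0_mlp_fc1.lora_up.weight
[INFO ] stable-diffusion.cpp:405  - lora 'Dressed animals XL' applied, taking 1.15s
"""

summary = summarize_unused(sample_log)
for prefix, n in summary.items():
    print(f"{prefix}: {n} unused tensors")
```

If every unused tensor shares one prefix, that would suggest only that part of the LoRA is being ignored while the rest is applied.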

Regards.

FrankEscobar, Feb 02 '24 19:02