stable-diffusion.cpp

Running img2img failed

Open rayrayraykk opened this issue 2 years ago • 2 comments

Thanks for your great work. The txt2img mode works fine, but I hit an assertion failure when using img2img mode. Any suggestions?

(mlc)- stable-diffusion.cpp % ./cmake-build-debug/bin/sd --mode img2img -m models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin -p "Cat" -i ./nano_cat_q8_0.png -o ./img2img_output_v21_1.png --strength 0.4
[INFO]  stable-diffusion.cpp:2830 - loading model from 'models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin'
[INFO]  stable-diffusion.cpp:2858 - model type: SD2.x
[INFO]  stable-diffusion.cpp:2866 - ftype: q8_0
[WARN]  stable-diffusion.cpp:3028 - unknown tensor 'cond_stage_model.model.transformer.text_model.embeddings.position_ids' in model file
[INFO]  stable-diffusion.cpp:3094 - total params size = 1923.94MB (clip 358.69MB, unet 1405.49MB, vae 159.76MB)
[INFO]  stable-diffusion.cpp:3096 - loading model from 'models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin' completed, taking 0.86s
[INFO]  stable-diffusion.cpp:3244 - check is_using_v_parameterization_for_sd2 completed, taking 0.99s
[INFO]  stable-diffusion.cpp:3121 - running in eps-prediction mode
[INFO]  stable-diffusion.cpp:4296 - img2img 128x128
[INFO]  stable-diffusion.cpp:4300 - target t_enc is 8 steps
Assertion failed: (sizeof(dst->nb[0]) == sizeof(float)), function asymmetric_pad, file stable-diffusion.cpp, line 1407.
zsh: abort      ./cmake-build-debug/bin/sd --mode img2img -m  -p "Cat" -i ./nano_cat_q8_0.png

~~The input image is generated by nano-SD2.1 with 128*128 resolution.~~

I also tried the provided example, and it fails with the same error.

[INFO]  stable-diffusion.cpp:2830 - loading model from './models/sd-v1-4-ggml-model-f16.bin'
[INFO]  stable-diffusion.cpp:2858 - model type: SD1.x
[INFO]  stable-diffusion.cpp:2866 - ftype: f16
[INFO]  stable-diffusion.cpp:3094 - total params size = 2035.23MB (clip 235.01MB, unet 1640.46MB, vae 159.76MB)
[INFO]  stable-diffusion.cpp:3096 - loading model from './models/sd-v1-4-ggml-model-f16.bin' completed, taking 1.75s
[INFO]  stable-diffusion.cpp:3121 - running in eps-prediction mode
[INFO]  stable-diffusion.cpp:4296 - img2img 512x512
[INFO]  stable-diffusion.cpp:4300 - target t_enc is 0 steps
Assertion failed: (sizeof(dst->nb[0]) == sizeof(float)), function asymmetric_pad, file stable-diffusion.cpp, line 1407.
zsh: abort      ./cmake-build-debug/bin/sd --mode img2img -m  -p "cat with blue eyes" -i  -o 

Edit: After temporarily commenting out the assertions below (lines 1407-1409), it works fine:

//        assert(sizeof(dst->nb[0]) == sizeof(float));
//        assert(sizeof(a->nb[0]) == sizeof(float));
//        assert(sizeof(b->nb[0]) == sizeof(float));

Also, I found that img2img can't change the resolution of the image. Could the input image be padded so that the output matches a target resolution?

rayrayraykk avatar Sep 21 '23 03:09 rayrayraykk

It's a bit weird. I used the same model and parameters, but I couldn't reproduce your issue. Could you please provide me with information about your environment? This should include your CPU architecture, operating system, and the compilation parameters for sd.cpp. Or you can use the '-v' flag when executing the command to get detailed output.

leejet avatar Oct 22 '23 05:10 leejet

> It's a bit weird. I used the same model and parameters, but I couldn't reproduce your issue. Could you please provide me with information about your environment? This should include your CPU architecture, operating system, and the compilation parameters for sd.cpp. Or you can use the '-v' flag when executing the command to get detailed output.

Thanks for your reply. This happens when setting CMAKE_BUILD_TYPE to Debug.

rayrayraykk avatar Oct 23 '23 04:10 rayrayraykk
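
A likely explanation for the Debug-only failure, offered as a sketch rather than a confirmed diagnosis: `assert` is compiled out when `NDEBUG` is defined, which CMake's default Release flags typically do for GCC and Clang, so the check only runs in Debug builds. The asserted expression `sizeof(dst->nb[0])` measures the size of the stride's type, not the stride value. ggml stores `nb` as `size_t`, so on a 64-bit target the comparison is 8 against 4 and fails regardless of the tensor. The snippet below illustrates the difference. `TensorSketch`, `type_size_check`, `stride_value_check`, `check_contiguous_f32`, and `asserts_enabled` are hypothetical names invented for this illustration and are not part of stable-diffusion.cpp; comparing the stride value is only a guess at what the check was meant to do.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a ggml-style tensor; only the stride array matters here.
// Assumption: strides are byte counts stored as size_t, like ggml's `nb` field.
struct TensorSketch {
    std::size_t nb[4];  // nb[0] = byte stride between consecutive elements
};

// What the failing assertion compares: the size of the stride's *type*.
// On a typical 64-bit target this is sizeof(size_t) == 8 versus sizeof(float) == 4.
inline bool type_size_check(const TensorSketch& t) {
    return sizeof(t.nb[0]) == sizeof(float);
}

// What the check was presumably meant to compare: the stride *value*.
inline bool stride_value_check(const TensorSketch& t) {
    return t.nb[0] == sizeof(float);
}

// A possible replacement assertion (an assumption, not the project's actual fix).
inline void check_contiguous_f32(const TensorSketch& t) {
    assert(stride_value_check(t));
}

// Reports whether assert() is active in this translation unit.
inline bool asserts_enabled() {
#ifdef NDEBUG
    return false;  // typical for CMAKE_BUILD_TYPE=Release
#else
    return true;   // typical for CMAKE_BUILD_TYPE=Debug
#endif
}
```

For a tensor whose innermost stride is `sizeof(float)`, `stride_value_check` returns true, while `type_size_check` returns true only on platforms where `size_t` and `float` happen to have the same size. That would be consistent with the failure appearing in a 64-bit Debug build and not being reproducible in a build where assertions are disabled.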