stable-diffusion.cpp
Running img2img failed
Thanks for your great work. Text-to-image mode works fine, but I get an error in img2img mode. Any suggestions?
(mlc)- stable-diffusion.cpp % ./cmake-build-debug/bin/sd --mode img2img -m models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin -p "Cat" -i ./nano_cat_q8_0.png -o ./img2img_output_v21_1.png --strength 0.4
[INFO]  stable-diffusion.cpp:2830 - loading model from 'models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin'
[INFO]  stable-diffusion.cpp:2858 - model type: SD2.x
[INFO]  stable-diffusion.cpp:2866 - ftype: q8_0
[WARN]  stable-diffusion.cpp:3028 - unknown tensor 'cond_stage_model.model.transformer.text_model.embeddings.position_ids' in model file
[INFO]  stable-diffusion.cpp:3094 - total params size = 1923.94MB (clip 358.69MB, unet 1405.49MB, vae 159.76MB)
[INFO]  stable-diffusion.cpp:3096 - loading model from 'models/stable-diffusion-nano-2-1-ggml-model-q8_0.bin' completed, taking 0.86s
[INFO]  stable-diffusion.cpp:3244 - check is_using_v_parameterization_for_sd2 completed, taking 0.99s
[INFO]  stable-diffusion.cpp:3121 - running in eps-prediction mode
[INFO]  stable-diffusion.cpp:4296 - img2img 128x128
[INFO]  stable-diffusion.cpp:4300 - target t_enc is 8 steps
Assertion failed: (sizeof(dst->nb[0]) == sizeof(float)), function asymmetric_pad, file stable-diffusion.cpp, line 1407.
zsh: abort      ./cmake-build-debug/bin/sd --mode img2img -m  -p "Cat" -i ./nano_cat_q8_0.png
~~The input image is generated by nano-SD2.1 with 128*128 resolution.~~
I also tried the provided example, and the same error occurs.
[INFO]  stable-diffusion.cpp:2830 - loading model from './models/sd-v1-4-ggml-model-f16.bin'
[INFO]  stable-diffusion.cpp:2858 - model type: SD1.x
[INFO]  stable-diffusion.cpp:2866 - ftype: f16
[INFO]  stable-diffusion.cpp:3094 - total params size = 2035.23MB (clip 235.01MB, unet 1640.46MB, vae 159.76MB)
[INFO]  stable-diffusion.cpp:3096 - loading model from './models/sd-v1-4-ggml-model-f16.bin' completed, taking 1.75s
[INFO]  stable-diffusion.cpp:3121 - running in eps-prediction mode
[INFO]  stable-diffusion.cpp:4296 - img2img 512x512
[INFO]  stable-diffusion.cpp:4300 - target t_enc is 0 steps
Assertion failed: (sizeof(dst->nb[0]) == sizeof(float)), function asymmetric_pad, file stable-diffusion.cpp, line 1407.
zsh: abort      ./cmake-build-debug/bin/sd --mode img2img -m  -p "cat with blue eyes" -i  -o 
Edit: After temporarily commenting out the assertions at lines 1407-1409, it works fine:
//        assert(sizeof(dst->nb[0]) == sizeof(float));
//        assert(sizeof(a->nb[0]) == sizeof(float));
//        assert(sizeof(b->nb[0]) == sizeof(float));
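A possible explanation for the failure, offered as a sketch rather than a confirmed diagnosis: in ggml the `nb` array holds byte strides as `size_t` values, so `sizeof(dst->nb[0])` measures the size of the `size_t` type (8 bytes on typical 64-bit platforms) and not the stride itself. The check was probably meant to compare the stride value with `sizeof(float)`. `MockTensor` and both helper functions below are hypothetical stand-ins, not the real `ggml_tensor` or sd.cpp code.

```cpp
#include <cstddef>

// Hypothetical stand-in for a tensor with per-dimension byte strides.
// Assumption: strides are stored as size_t, as in ggml's `nb` array.
struct MockTensor {
    std::size_t nb[4];
};

// What the original assertion evaluates: the size of the stride's *type*.
// The result never depends on the tensor's contents.
inline bool type_size_check(const MockTensor& t) {
    return sizeof(t.nb[0]) == sizeof(float);
}

// What the assertion probably intended: the innermost stride equals
// sizeof(float), i.e. the tensor stores contiguous f32 elements.
inline bool stride_value_check(const MockTensor& t) {
    return t.nb[0] == sizeof(float);
}
```

On a platform where `size_t` is wider than `float`, `type_size_check` is false for every tensor, which would make the original assertions abort whenever they are compiled in.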
Also, I found that img2img can't change the resolution of the image. Could the input image be padded so that the output matches a target resolution?
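To make the padding idea concrete, here is a minimal sketch of one way it could be done outside the tool: copy the source image into the centre of a larger canvas filled with a constant colour, then pass the padded image to img2img. This is not something sd.cpp is known to provide. The function name `pad_rgb_image`, the interleaved 8-bit RGB layout, and the grey fill value are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: centre an interleaved 8-bit RGB image on a larger canvas.
std::vector<std::uint8_t> pad_rgb_image(const std::vector<std::uint8_t>& src,
                                        int src_w, int src_h,
                                        int dst_w, int dst_h,
                                        std::uint8_t fill = 127) {
    const std::size_t channels = 3;
    if (src_w <= 0 || src_h <= 0 || dst_w < src_w || dst_h < src_h) {
        throw std::invalid_argument("target must be at least as large as the source");
    }
    if (src.size() != static_cast<std::size_t>(src_w) * src_h * channels) {
        throw std::invalid_argument("source buffer size does not match its dimensions");
    }

    // Start with a canvas that is entirely the fill colour.
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(dst_w) * dst_h * channels, fill);

    // Offsets that centre the source on the canvas.
    const std::size_t off_x = static_cast<std::size_t>(dst_w - src_w) / 2;
    const std::size_t off_y = static_cast<std::size_t>(dst_h - src_h) / 2;
    const std::size_t row_bytes = static_cast<std::size_t>(src_w) * channels;

    // Copy the source one row at a time into its place on the canvas.
    for (std::size_t y = 0; y < static_cast<std::size_t>(src_h); ++y) {
        const std::size_t src_row = y * row_bytes;
        const std::size_t dst_row =
            ((y + off_y) * static_cast<std::size_t>(dst_w) + off_x) * channels;
        std::copy_n(src.data() + src_row, row_bytes, dst.data() + dst_row);
    }
    return dst;
}
```

A constant fill is the simplest choice; edge replication or mirroring might blend better, but that depends on the model and the strength setting.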
It's a bit weird. I used the same model and parameters, but I couldn't reproduce your issue. Could you please provide me with information about your environment? This should include your CPU architecture, operating system, and the compilation parameters for sd.cpp. Or you can use the '-v' flag when executing the command to get detailed output.
Thanks for your reply. This happens when setting CMAKE_BUILD_TYPE to Debug.
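This is consistent with how `assert` behaves in C and C++: the macro expands to nothing when `NDEBUG` is defined, and CMake's default Release flags for common compilers typically define it (for example `-DNDEBUG`). A Debug build therefore evaluates the assertions in `asymmetric_pad`, while a Release build skips them. The helper name `assertions_enabled` below is hypothetical; the sketch only reports which mode the current translation unit was compiled in.

```cpp
#include <cassert>

// Hypothetical helper: reports whether assert() is active in this build.
constexpr bool assertions_enabled() {
#ifdef NDEBUG
    return false;  // assert(expr) expands to a no-op
#else
    return true;   // assert(expr) aborts when expr is false
#endif
}
```

Printing the result of `assertions_enabled()` at startup is one quick way to confirm which build type a binary came from when a failure cannot be reproduced on another machine.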