stable-diffusion.cpp

using the VAE

Open HarvieKrumpet opened this issue 1 year ago • 3 comments

I have everything working and integrated with your lib, and I'm happily using all the features, but I am stuck at the VAE. When I include the VAE tensors it runs with no errors, but the image is distorted. I have tried every which way but Sunday, using different VAEs and switching between SD versions, and it always ends up the same. Am I supposed to use a specific VAE with each SD version, or set some specific settings? Or does the VAE require a specific resolution?

Could you clarify which VAE file goes with which SD base file, so I don't have to guess? I chose not to learn with automatic or webui since they are Python based, so I may be missing something obvious.

Also, does image2image work? The code looks like it's there, but there is no example of how to feed it an image correctly.

HarvieKrumpet avatar May 26 '24 18:05 HarvieKrumpet

It would help if you posted the exact command you're trying to use to run it.

img2img works fine, just set the "--mode" switch to "img2img" and provide an input file with "--init-img", you can then adjust the de-noising strength with "--strength".
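As a rough illustration of what the strength value does: in many diffusion front ends it scales how many of the configured sampling steps are actually run on the noised input image, so a lower strength stays closer to the original. The sketch below assumes that convention; `img2img_steps` is a hypothetical helper and not part of the stable-diffusion.cpp API, and whether this library computes it exactly this way is an assumption.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not a stable-diffusion.cpp function): estimate how many
// denoising steps run for a given strength, assuming the common convention
// steps_run = floor(sample_steps * strength).
std::size_t img2img_steps(std::size_t sample_steps, float strength) {
    // Clamp strength into [0, 1] so out-of-range input cannot exceed sample_steps.
    const float s = std::max(0.0f, std::min(strength, 1.0f));
    return static_cast<std::size_t>(static_cast<float>(sample_steps) * s);
}
```

Under that assumption, 20 sampling steps at a strength of 0.5 would run 10 denoising steps, and a strength of 1.0 would run all 20, which largely discards the input image.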

grauho avatar May 27 '24 10:05 grauho

What are the correct files and sample settings to get the VAE to work? I am compiling and linking directly against your lib from C#, so there are no command-line switches.

Which VAE tensor should I use, and is it specific to SD, SDXL, SD2.1, or SD1.4? And regarding the FP16 issue, should I default to FP32 for the VAE instead of FP16? There is no error, but the output image is slightly distorted when I include the VAE. Am I supposed to set VAE decode only, or use only the decoder VAE? There are three tensors: VAE, VAE decoder, and VAE encoder. I did get img2img to work.

thanks,

HarvieKrumpet avatar May 27 '24 17:05 HarvieKrumpet

In that case I would recommend taking a look at /examples/main.cpp if you want to see how the API is set up and called, and having the cli program compiled to test the same settings and prompt against your C# program would probably be helpful.

I believe that VAEs are usually tied to a base model, e.g. an SD1.5 VAE should function with any model based on SD1.5. That said, there are exceptions, like ponyXL, which has its own VAE that works better with it.

If you're using an SDXL model, there is a known issue that requires a VAE fix when running in half-precision float (FP16) mode; the program does warn you about it on the command line. See: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix
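For background on why FP16 can matter here: a half-precision float cannot represent magnitudes above 65504, so any intermediate value larger than that overflows. As I understand the linked fix, that is the kind of problem it works around. The sketch below only illustrates the range limit; `fits_in_fp16` and `all_fit_in_fp16` are hypothetical helpers, not part of the stable-diffusion.cpp API.

```cpp
#include <cmath>
#include <cstddef>

// Largest finite value representable in IEEE 754 half precision (FP16).
constexpr float kFp16Max = 65504.0f;

// Hypothetical helper: true if x is finite and its magnitude is within the
// FP16 finite range, i.e. storing it as FP16 would not overflow to infinity.
bool fits_in_fp16(float x) {
    return std::isfinite(x) && std::fabs(x) <= kFp16Max;
}

// Hypothetical helper: true if every value in the buffer is within FP16 range.
bool all_fit_in_fp16(const float* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (!fits_in_fp16(data[i])) {
            return false;
        }
    }
    return true;
}
```

A check like this, run on FP32 values, is one way to see whether a buffer would survive a conversion to FP16. If it would not, keeping the VAE in FP32 or using an FP16-safe VAE are the two options mentioned in this thread.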

grauho avatar May 27 '24 19:05 grauho