stable-diffusion-webui
Added VAE and VAE hash to image generation metadata
This PR adds VAE information to image metadata. The PNG Info tab reads that VAE metadata and, when any of the "Send to" buttons are clicked, will set the user's sd_vae setting to match the VAE from the metadata. #6031
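As a rough illustration of the idea, the sketch below appends VAE fields to a comma-separated infotext line and reads them back. It is not the PR's actual code: the helper names `build_infotext` and `parse_infotext`, the exact field labels "VAE" and "VAE hash", and the hash values are all assumptions made for this example.

```python
import re


def build_infotext(params: dict) -> str:
    """Join parameters into one 'Key: value, Key: value' line, skipping unset ones."""
    return ", ".join(f"{key}: {value}" for key, value in params.items() if value is not None)


def parse_infotext(line: str) -> dict:
    """Split a 'Key: value, Key: value' line back into a dict of strings."""
    pairs = re.findall(r"([^:,]+):\s*([^,]+)", line)
    return {key.strip(): value.strip() for key, value in pairs}


# Hypothetical generation parameters; the hashes are placeholders, not real values.
params = {
    "Steps": 20,
    "Sampler": "Euler a",
    "Model hash": "abcdef12",
    "VAE": "vae-ft-mse-840000-ema-pruned",
    "VAE hash": "0123abcd",
}

infotext = build_infotext(params)
parsed = parse_infotext(infotext)
print(infotext)
print(parsed["VAE"], parsed["VAE hash"])
```

A reader such as the PNG Info tab could then look up `parsed["VAE"]` or `parsed["VAE hash"]` and apply it as a setting. This simple parser assumes values never contain commas.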
I'm not really sure if I want this functionality in. Overloading infotext with extra information is definitely undesired, and if there's no call from users to have this, adding more code to the repo is also undesired.
Adding more fields to the infotext helps make images more reproducible but clearly it's not possible to include everything so it's up to you which settings make the cut.
I agree that this PR is only mildly useful, so I don't mind at all if we just close it.
Personally I'd find this useful. Sometimes the VAE I use happens to overcook my image depending on my parameters, so it would be nice to have it switch automatically.
That could be a useful function for me since there are a few VAEs in my webui. Sometimes the choice of VAE matters for the image.
I think this would be a good addition as well. Every so often I see people trying to replicate other people's images using the metadata but getting confused at differences in the result. These turn out to be caused by using no VAE or a different VAE. I think it would be consistent with other metadata to include it, particularly now that there are a handful of different popular models using interchangeable VAEs.
Still not sure if this will get merged but I'll fix the merge conflict today or tomorrow
For historical reasons, it was customary to rename the VAE file to match the name of the ckpt, so you might want to record the hash if you can.
~~Seconding @aka7774 - the hash of the VAE would be way more useful~~ it looks like the commit already includes that? As long as they're both toggleable options, like the existing options for storing the model name and/or the model hash, I like the idea. For example, I have the SD 1.5 MSE VAE saved as `vae-ft-mse-840000-ema-pruned.vae.pt`, exactly as I found it (other than the original `.ckpt`), because I'm lazy. Same goes for the one Waifu Diffusion 1.4 VAE I downloaded, `kl-f8-anime2.vae.pt`, again because I'm lazy. But someone might have that second one saved as `anime2.vae.pt`, or `Waifu Diffusion kl-f8 anime2.vae.pt` because they're monsters and they put spaces in their filenames, or `berrymix.vae.pt`, or or or.
The hash would ultimately be the more useful of the two, with the name being just for user readability like with the option to include it for models.
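To make the point concrete, here is a minimal sketch showing that a content hash stays the same no matter how the file is named. It assumes a truncated SHA-256 digest, which is not necessarily the hashing scheme the webui or this PR uses; `short_file_hash` is a hypothetical helper, and the files are tiny stand-ins rather than real VAE weights.

```python
import hashlib
import tempfile
from pathlib import Path


def short_file_hash(path, length=10, chunk_size=1 << 20):
    """Return the first `length` hex characters of the file's SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


with tempfile.TemporaryDirectory() as tmp:
    payload = b"stand-in bytes, not a real VAE"
    original = Path(tmp) / "kl-f8-anime2.vae.pt"
    renamed = Path(tmp) / "Waifu Diffusion kl-f8 anime2.vae.pt"
    different = Path(tmp) / "berrymix.vae.pt"

    original.write_bytes(payload)
    renamed.write_bytes(payload)            # same content, different filename
    different.write_bytes(b"other bytes")   # different content

    hash_original = short_file_hash(original)
    hash_renamed = short_file_hash(renamed)
    hash_different = short_file_hash(different)

print(hash_original == hash_renamed)    # identical content gives an identical hash
print(hash_original == hash_different)  # different content gives a different hash
```

With a scheme like this, the name stored in the metadata serves readability while the hash does the actual matching.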
After an X/Y-Plot to test a lot of possible combinations of my Models vs. VAEs (and the crash of my computer in the middle ;_; ), I would have really loved to have had this feature.
I think it would be useful and a good idea as long as it is toggleable and maybe disabled by default.
Would love to see this get merged
It looks like the code has changed a bit since I wrote the initial pull request a month ago. It shouldn't be hard to fix, but I don't understand the new code and I don't think it's working properly.
On the master branch, the "Send to" functionality for setting the model checkpoint is broken for me. If I have model A loaded, read the png info of an image created with model B, then click send to txt2img, model B does not get loaded. Since this PR uses similar logic for setting the vae, I can't fix my code until the other "Send to" functionality is fixed.
Is anyone else having the same issue or is it just me? I found a related bug but it's not quite the same: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/7339
This PR is now working properly using the new copy/paste params logic. If you drag-and-drop an image into the txt2img prompt, then click "Read generation parameters", the VAE gets assigned as an override, just like the model hash and other settings.
The "Send To" functionality, however, is not working as expected because the Send To buttons on the PNG Info page haven't been connected properly to the new override settings dropdown. I could create a separate PR to fix that issue, but it shouldn't block this one.
I also think this would be useful particularly when experimenting with different VAEs
While I do see this as useful in theory, in practice, unlike models, there are only 5 VAEs in wide circulation: NovelAi, WD-Anime1, WD-Anime2, SD-EMA, SD-MSE.
If that situation doesn't change, it would be arguably more user friendly to identify and tag abbreviated names of those VAE rather than use a hash. Since the stable-diffusion-webui-model-toolkit extension is able to detect those known VAE even when pruned, converted to FP16, or embedded into a model, it should be possible for WebUI proper to store such info on Model/VAE load for metadata purposes. This may go a bit beyond the scope of this pull request though.
I think it will be hard to tell if the situation changes in say a year. Maybe we will get even more VAEs with better performance? Things are moving pretty quickly in this space, I'd prefer the most future-proof option
This raises the question of why more custom VAEs haven't been trained and released publicly in the past 6 months. It feels like 90%+ of custom models just use the NovelAi VAE, which admittedly is quite good aside from its tendency to produce NaNs with FP16 Bias (but not BF16 Bias).
A bit off-topic, but is anyone aware of added complications or complexities regarding the training and creation of VAE and why only big organizations seem to be creating them? Does VAE training require a specialized dataset or tools?
It really feels like we've been in need of a specialized VAE for use with img2img. One which is more conservative in terms of levels/contrast enhancements, and designed to resist darkening beyond the source img as well as baking/scorching. All the existing VAE seem more finetuned to text2img and aim for high contrast output. I hope in 2023 we do start seeing more custom VAE appear.