diffusers [Community] Move the number "0.18215" from the image2image process to VAE config

Open wangyu-ustc opened this issue 2 years ago • 10 comments

There is a magic number "0.18215" in the repository

In the file src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py, the number "0.18215" appears on line 220 and line 342, which is strange since it does not seem to occur in the original repo. Could someone clarify why it is there and where this number comes from?

Oct 04 '22 23:10 wangyu-ustc

It's a constant used to scale the latents so they can be decoded back into an image (src)

# scale and decode the image latents with vae
latents = 1 / 0.18215 * latents
image = vae.decode(latents).sample
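
For illustration, here is a minimal round-trip sketch of what the constant does. It assumes the encode side multiplies the raw VAE latents by 0.18215 (as the img2img pipeline does) and the decode side divides by it. Plain Python lists stand in for tensors, and the helper names `scale_latents` and `unscale_latents` are hypothetical, not diffusers APIs.

```python
SCALING_FACTOR = 0.18215  # the constant discussed in this issue


def scale_latents(raw_latents, scaling_factor=SCALING_FACTOR):
    """Scale raw VAE latents before they are passed to the diffusion model."""
    return [scaling_factor * x for x in raw_latents]


def unscale_latents(latents, scaling_factor=SCALING_FACTOR):
    """Undo the scaling just before calling the VAE decoder."""
    return [x / scaling_factor for x in latents]


# Toy values standing in for a latent tensor (hypothetical data).
raw = [5.0, -3.2, 0.7]
scaled = scale_latents(raw)
restored = unscale_latents(scaled)
```

Dividing by the factor on the way out simply inverts the multiplication on the way in, so `restored` matches `raw` up to floating-point error.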

Oct 05 '22 04:10 WASasquatch

I think the constant is defined in the model config file from CompVis/stable-diffusion.

Oct 05 '22 04:10 guaneec

There's more explanation about it in #437.

Oct 05 '22 05:10 pcuenca

Let's maybe put it directly in the VAE config then? cc @patil-suraj

Oct 05 '22 10:10 patrickvonplaten

Maybe this can be a method for a VAE that is overridable? For supporting more complex squashing functions 😉

Oct 05 '22 15:10 neverix

I think we can make this an overridable config parameter typed as Union[float, str], where a string would name a more complex squashing function that can be implemented down the road.
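
One way this idea could look, as a rough sketch only: the field is typed here as Union[float, str] on the assumption that the numeric case is a float like 0.18215. The names `ToyVAEConfig`, `SQUASHERS`, and `apply_scaling` are hypothetical and are not what the eventual PR implements.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# Hypothetical registry of named squashing functions; "tanh" is only an example.
SQUASHERS: Dict[str, Callable[[float], float]] = {"tanh": math.tanh}


@dataclass
class ToyVAEConfig:
    # A float means plain multiplication; a string selects a named function.
    scaling: Union[float, str] = 0.18215


def apply_scaling(config: ToyVAEConfig, latents: List[float]) -> List[float]:
    """Scale latents according to the config value."""
    if isinstance(config.scaling, str):
        squash = SQUASHERS[config.scaling]  # KeyError for unknown names
        return [squash(x) for x in latents]
    return [config.scaling * x for x in latents]


default_scaled = apply_scaling(ToyVAEConfig(), [1.0, 2.0])
tanh_scaled = apply_scaling(ToyVAEConfig(scaling="tanh"), [0.0, 1.0])
```

Keeping the default as a plain float preserves today's behaviour, while the string branch leaves room for non-linear functions without changing the config schema.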

Marking this as a community feature for now, as it seems no one has found the time to open a PR here. In case you're interested @neverix, we'd be more than happy to review a PR :-)

Nov 07 '22 19:11 patrickvonplaten

Should be solved by: https://github.com/huggingface/diffusers/issues/1460

@williamberman could you maybe tackle this?

Dec 01 '22 16:12 patrickvonplaten

Put up a draft PR here: https://github.com/huggingface/diffusers/pull/1515. I still need to think about a few things before finishing.

Dec 01 '22 22:12 williamberman

For reference, here's some code to estimate the magic value: https://github.com/huggingface/diffusers/issues/437#issuecomment-1356945792.
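
As a rough sketch of the idea (not the linked code): in the original latent-diffusion code the scale factor is taken as the reciprocal of the standard deviation of the encoded latents, so it can be estimated from a sample of latent values. The data below is synthetic, drawn with a standard deviation of 1 / 0.18215 purely for illustration; with a real model you would encode a batch of images instead. `estimate_scale_factor` is a hypothetical helper name.

```python
import random
import statistics


def estimate_scale_factor(latent_values):
    """Return 1 / std of the flattened latent values."""
    return 1.0 / statistics.stdev(latent_values)


# Synthetic stand-in for flattened VAE latents (assumption: std = 1 / 0.18215).
rng = random.Random(0)
fake_latents = [rng.gauss(0.0, 1.0 / 0.18215) for _ in range(100_000)]

estimate = estimate_scale_factor(fake_latents)
```

Multiplying latents by this estimate brings them to roughly unit variance, which is the range the diffusion model is trained on.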

Dec 19 '22 01:12 fepegar

Thanks a lot @fepegar !

Dec 19 '22 23:12 patrickvonplaten

For people following this: the new PR is #1860

Jan 16 '23 11:01 hervenivon

https://github.com/huggingface/diffusers/pull/1860 is now merged, closing the issue.

Jan 26 '23 13:01 patil-suraj