stable-diffusion.cpp

Over-exposed parts of generated images at some resolutions with VAE (or TAE) tiling, and upscaler

Open stduhpf opened this issue 11 months ago • 2 comments

I noticed this problem some time ago, but I forgot to open an issue about it. It's very noticeable with tiled TAESD and with the upscaler, and less so with tiled VAE.

It happens with tiled VAE when the resolution isn't a multiple of 128 in both dimensions, with tiled TAESD when the resolution isn't a multiple of 256 in both dimensions, and with the upscaler when the resolution isn't a multiple of 512 in both dimensions (so 4x the tile size?).
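To make the rule above easier to check against the examples, here is a minimal sketch that encodes the observed thresholds. The stage names, the `REQUIRED_MULTIPLE` table, and the `affected_stages` helper are hypothetical and only restate the observation; they are not taken from the stable-diffusion.cpp source.

```python
# Observed thresholds from this report (assumption: they describe the symptom,
# not the actual tiling logic in stable-diffusion.cpp).
REQUIRED_MULTIPLE = {
    "tiled_vae": 128,
    "tiled_taesd": 256,
    "upscaler": 512,
}


def affected_stages(width: int, height: int) -> list[str]:
    """Return the stages expected to show over-exposed regions.

    A stage is listed when either dimension is not a multiple of the
    threshold observed for that stage.
    """
    return [
        stage
        for stage, multiple in REQUIRED_MULTIPLE.items()
        if width % multiple != 0 or height % multiple != 0
    ]


if __name__ == "__main__":
    for size in [(512, 576), (512, 640), (512, 768), (512, 512)]:
        print(size, affected_stages(*size))
```

Run against the four resolutions in this report, the helper lists the same stages that are described as broken for each one.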

Example:

  • SD1.x, 512x576 resolution: both tiled autoencoders and the upscaler are broken
    [Images: VAE | Tiled VAE | TAESD | Tiled TAESD]
    [Image: VAE + upscaler (RealESRGAN_x4)]
  • SD1.x, 512x640 resolution: only tiled TAESD and the upscaler are broken
    [Images: VAE | Tiled VAE | TAESD | Tiled TAESD]
    [Image: VAE + upscaler]
  • SD1.x, 512x768 resolution: only the upscaler is broken
    [Images: VAE | Tiled VAE | TAESD | Tiled TAESD]
    [Image: VAE + upscaler]

(I swear it's using the exact same prompt as the other examples)

  • SD1.x, 512x512 resolution: control, everything is fine
    [Images: VAE | Tiled VAE | TAESD | Tiled TAESD]
    [Image: VAE + upscaler]

stduhpf, Feb 06 '25 18:02

It is normal since SD 1.5 does not support resolutions that are not multiples of 2, and as long as one side of the image is 512, there is no problem. 512x512, 512x768, and 768x512 are the resolutions it handles without generating a lot of artifacts or aberrations.

FSSRepo, Feb 08 '25 15:02

It's not really about the quality of the generated image. It's something that only happens when using tiling, regardless of the model used. VAE without tiling works regardless of the resolution.

For example, this still happens with SD3.5 Medium or Flux, even though those models support arbitrary resolutions. I only used SD1.5 (DreamShaper 8 LCM, to be exact) because it was faster to generate all the example images with it.

Also, the upscaler still has the issue with a 512x768 input resolution.

stduhpf, Feb 08 '25 16:02