
Checkpoint trained on only 256x256 data?

Open carlini opened this issue 3 years ago • 9 comments

The README says the v1.1 checkpoint was trained on 256x256 images and then fine-tuned on 512x512 images. Is there any way we can access this 256x256 model as a 1.0 checkpoint? There are various purposes for which a lower-resolution model would be more useful. For example, if I want to denoise ImageNet images, then the 256x256 model better matches the size of ImageNet and so might perform better than the 512x512 model.

carlini avatar Aug 22 '22 23:08 carlini

I have the same request. Thanks

Yuheng-Li avatar Aug 26 '22 14:08 Yuheng-Li

> The README says the v1.1 checkpoint was trained on 256x256 images and then fine-tuned on 512x512 images. Is there any way we can access this 256x256 model as a 1.0 checkpoint? There are various purposes for which a lower-resolution model would be more useful. For example, if I want to denoise ImageNet images, then the 256x256 model better matches the size of ImageNet and so might perform better than the 512x512 model.

why would you want a shitter model?

breadbrowser avatar Aug 27 '22 01:08 breadbrowser

I have one specific use case in mind where 256x256 is, in fact, not "shittier": diffusion models make great denoisers for improving certified adversarial robustness, as long as the model's resolution matches the image size (https://arxiv.org/abs/2206.10550). So the fact that 256x256 is closer to 224x224 makes this model much better for that purpose.

I suspect it might be useful for other purposes as well; and if you don't think this model would be useful for you, then just don't use it.
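Concretely, the denoiser trick maps the smoothing noise level sigma to the diffusion timestep whose marginal noise level matches it. A minimal numpy sketch, assuming the standard DDPM linear beta schedule (beta from 1e-4 to 0.02 over 1000 steps; the schedule Stable Diffusion actually uses differs slightly):

```python
import numpy as np

# Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps.
# Dividing x_t by sqrt(abar_t) gives x0 + sigma_t * eps with
# sigma_t = sqrt((1 - abar_t) / abar_t), so we pick the t whose
# sigma_t is closest to the smoothing noise level we want to certify.
betas = np.linspace(1e-4, 0.02, 1000)  # assumed DDPM linear schedule
abar = np.cumprod(1.0 - betas)
sigmas = np.sqrt((1.0 - abar) / abar)

def timestep_for_sigma(sigma):
    """Timestep whose effective noise level is closest to sigma."""
    return int(np.argmin(np.abs(sigmas - sigma)))

t = timestep_for_sigma(0.5)  # e.g. certify at sigma = 0.5
```

One then scales the noisy input by sqrt(abar_t) and runs a single denoising step at timestep t.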

carlini avatar Aug 27 '22 01:08 carlini

@carlini, I found this https://huggingface.co/justinpinkney/miniSD.

mikeogezi avatar Mar 03 '23 00:03 mikeogezi

I found one here: https://huggingface.co/lambdalabs/sd-image-variations-diffusers. Just swap in the UNet from this model and it will work.
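Roughly like this (untested sketch; it assumes the two UNets are drop-in compatible as claimed above, and note that checkpoint is image-conditioned rather than text-conditioned, so results with text prompts may vary):

```python
# Untested sketch: load SD v1.4, then swap in the UNet from the
# 256x256-native checkpoint linked above. Requires the diffusers
# library and downloads of both checkpoints.
from diffusers import StableDiffusionPipeline, UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", subfolder="unet"
)
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", unet=unet
)
image = pipe("an astronaut riding a horse", height=256, width=256).images[0]
```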

pmzzs avatar Apr 07 '23 20:04 pmzzs

I would also be interested in the checkpoint of the model trained only on 256x256 data. It would be nice if you could provide it!

ksai2324 avatar Jun 22 '23 14:06 ksai2324

https://huggingface.co/lambdalabs/miniSD-diffusers

mikeogezi avatar Jun 22 '23 16:06 mikeogezi

@carlini Thanks for the important information. I want to reproduce the Stable Diffusion model. The first stage trains the model on 256x256 data and the second fine-tunes it on 512x512 images. Do these two stages use two different autoencoders, or the same autoencoder, which seems to have been trained on the OpenImages dataset? Thanks.
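For reference, the released first-stage model is (to my knowledge) a single KL-f8 autoencoder, and since it is fully convolutional the same weights should handle both stages; only the latent resolution changes. A quick sketch of the arithmetic, assuming downsampling factor f = 8 and 4 latent channels:

```python
# Latent-space sizes for the two training stages, assuming the
# KL-f8 autoencoder (spatial downsampling factor 8, 4 latent channels).
F = 8

def latent_shape(h, w, channels=4):
    """Latent tensor shape produced by a fully convolutional f=8 VAE."""
    assert h % F == 0 and w % F == 0, "input must be divisible by f"
    return (channels, h // F, w // F)

print(latent_shape(256, 256))  # stage 1 latents: (4, 32, 32)
print(latent_shape(512, 512))  # stage 2 latents: (4, 64, 64)
```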

wtliao avatar Aug 02 '23 09:08 wtliao

> @carlini Thanks for the important information. I want to reproduce the Stable Diffusion model. The first stage trains the model on 256x256 data and the second fine-tunes it on 512x512 images. Do these two stages use two different autoencoders, or the same autoencoder, which seems to have been trained on the OpenImages dataset? Thanks.

Hi, I have the same query.

jing-yu-lim avatar Feb 16 '24 03:02 jing-yu-lim