taming-transformers

How to sample high-resolution images?

Open ysig opened this issue 3 years ago • 3 comments

Hi,

this paper claims to be able to produce high-resolution images, yet there is no configuration in the code that can train on images bigger than 256x256. Increasing the resolution to 512x512 in the VQGAN leads to out-of-memory errors.

I guess it should then be possible to generate higher-resolution images from models trained at a smaller resolution, but does your code base include an implementation that does this?

Thank you for this wonderful work,

ysig • Jun 29 '21 07:06

The out-of-memory errors are most likely not an issue with the codebase, but rather with the system you are running it on. I have not run the code myself, but I would assume that if your system had more memory you would not get these errors.

zbloss • Aug 26 '21 17:08

You have not understood my question.

Training a GAN at 512x512 leads to OOM, which of course depends on my system.

However, the official paper claims that it can sample bigger images (e.g. 512x512) from smaller models (i.e. trained at 256x256) by sliding a window over the image and calculating the transformer output only within that region: [image]
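The sliding-window idea described here can be sketched roughly as follows. This is a minimal illustration, not code taken from the repository: `crop_start`, `sample_sliding_window`, and the `next_token` callback are hypothetical names, and `next_token` stands in for a trained transformer that samples one codebook index given the tokens preceding the current position inside the local window. The sketch assumes raster-order generation and a window that is centred on the current position where possible and clamped at the borders.

```python
import random


def crop_start(pos, window, size):
    """Start of a length-`window` crop centred on `pos`, clamped to [0, size - window]."""
    start = pos - window // 2
    return max(0, min(start, size - window))


def sample_sliding_window(height, width, window, next_token, seed=0):
    """Fill a height x width grid of codebook indices in raster order.

    Only a window x window patch around the current position is ever shown
    to the model, so the grid can be larger than the training resolution.
    `next_token(context, rng)` is a hypothetical stand-in for the transformer.
    """
    if window > height or window > width:
        raise ValueError("window must fit inside the grid")
    rng = random.Random(seed)
    grid = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            top = crop_start(i, window, height)
            left = crop_start(j, window, width)
            # Position of (i, j) inside the local patch.
            li, lj = i - top, j - left
            # Flatten the patch row by row, as the model would see it.
            flat = [
                grid[r][c]
                for r in range(top, top + window)
                for c in range(left, left + window)
            ]
            # Everything before (li, lj) in the patch's raster order has
            # already been sampled: rows above are complete, and so are the
            # entries to the left in the current row.
            context = flat[: li * window + lj]
            grid[i][j] = next_token(context, rng)
    return grid


def toy_next_token(context, rng, vocab_size=1024):
    """Toy sampler (assumption): ignores the context and draws uniformly."""
    return rng.randrange(vocab_size)


# A 16x16 window (e.g. a model trained on 16x16 token grids) filling a 32x32 grid.
codes = sample_sliding_window(32, 32, 16, toy_next_token)
```

In the actual method the resulting grid of indices would then be mapped back to pixels by the VQGAN decoder; that step is omitted here.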

This is the whole selling point of the paper, "Taming Transformers for High-Resolution Image Synthesis": [image]

In this example, no model was trained on 1280x460 pixels. Instead, a smaller model was used with the method I describe above to sample a high-resolution image. This makes the approach more lightweight, since training (which is the memory-consuming part) can run on smaller systems.

However, I haven't found any code in the codebase for this particular type of sampling.

ysig • Aug 26 '21 18:08

https://colab.research.google.com/github/CompVis/taming-transformers/blob/master/scripts/taming-transformers.ipynb#scrollTo=5rVRrUOwbEH0

IceClear • Apr 08 '22 03:04