
Replication of the upscalers

Open rom1504 opened this issue 2 years ago • 5 comments

Hey, so we got decent versions of the prior and the basic decoder now.

I think the current code is already able to train upscalers, but we need more documentation for it.

Let's have an upscaler.md explaining:

  • What it is
  • How to prepare the dataset
  • What hyperparameters to use
  • The command to run the training
  • The expected GPU-hours cost

And then train it!
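To make the "command to run the training" part concrete, here is a sketch of what an upsampler-only training config could look like. Field names and values below are illustrative assumptions, not the repository's actual config schema; the real examples live in the repository's configs directory.

```json
{
  "decoder": {
    "unets": [
      {"dim": 320, "image_embed_dim": 768, "dim_mults": [1, 2, 3, 4]},
      {"dim": 128, "image_embed_dim": 768, "dim_mults": [1, 2, 3, 4], "lowres_cond": true}
    ],
    "image_sizes": [64, 256],
    "timesteps": 1000
  },
  "train": {
    "unet_number": 2,
    "batch_size": 64
  }
}
```

The key idea is the second unet is conditioned on the low-resolution output of the first (`lowres_cond`), and `unet_number` selects which unet of the cascade this run trains.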

We can also discuss what the right dataset is, but I figure the LAION-5B subset we call "laion high resolution" could do the trick (it's 170M images at 1024x1024 or bigger).

I understand only the image (and CLIP image embedding) is needed, and no text?

rom1504 avatar Jun 19 '22 19:06 rom1504

Here are some relevant sections of the paper for reference while in this thread:


[screenshots of the relevant paper sections]

nousr avatar Jun 19 '22 20:06 nousr

they are also using the BSR degradation used by Rombach et al. (https://github.com/CompVis/latent-diffusion/tree/e66308c7f2e64cb581c6d27ab6fbeb846828253b/ldm/modules/image_degradation, https://github.com/cszn/BSRGAN/blob/main/utils/utils_blindsr.py), which I don't have in the repository yet

tempted to just go with Imagen's noising procedure (on top of the blur) and call it a day; it would be a lot simpler
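A minimal sketch of that simpler Imagen-style conditioning augmentation: blur the low-resolution conditioning image, then add Gaussian noise at some level, instead of running the full BSR degradation pipeline. The box filter below is a naive stand-in for a proper Gaussian blur, and the function names are illustrative, not the repository's API.

```python
import numpy as np

def box_blur(img, k=3):
    """Tiny box blur over a 2D array (edge rows/cols left untouched)."""
    out = img.copy()
    r = k // 2
    for i in range(r, img.shape[0] - r):
        for j in range(r, img.shape[1] - r):
            out[i, j] = img[i - r:i + r + 1, j - r:j + r + 1].mean()
    return out

def degrade_for_conditioning(img, noise_level=0.1, rng=None):
    """Blur then add Gaussian noise -- the Imagen-style augmentation."""
    rng = rng or np.random.default_rng(0)
    return box_blur(img) + noise_level * rng.standard_normal(img.shape)

img = np.ones((8, 8))
cond = degrade_for_conditioning(img)
print(cond.shape)  # (8, 8)
```

At training time the noise level would be sampled per example (and, as in Imagen, also fed to the upsampler as conditioning), so the model learns to be robust to imperfect low-resolution inputs.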

lucidrains avatar Jun 20 '22 15:06 lucidrains

ok, 0.11.0 should allow for the different noise schedules across different unets, as in the paper

after adding the BSR image degradation (or some alternative), i think i'm comfortable giving the repository a 1.0

lucidrains avatar Jun 20 '22 16:06 lucidrains

I understand only the image (and CLIP image embedding) is needed, and no text?

@rom1504 yup, no text conditioning needed, i think it should all be in the image embedding!

lucidrains avatar Jun 20 '22 16:06 lucidrains

Hi all, I am aiming to train the decoder and the upsampler. Because they have too many parameters, I have decided to train them separately. The readme says the upsampler and the decoder net can be trained separately, but from reading the code, my understanding is that even then I would need to load the parameters of both unet 0 and unet 1 and set the unet number to 1 in order to train only unet 1. I don't know if I am right. If so, I couldn't train unet 0 and unet 1 on two separate machines. I am wondering how I could train the decoder net and the upsamplers separately? Best,
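For context on the question above, here is a toy sketch of the pattern the readme describes, under the assumption that the loss for a given unet number only ever touches that unet's weights. The classes below are illustrative stand-ins, not the dalle2_pytorch API; the point is that the forward pass routes to exactly one unet, which is why training them on separate machines is at least structurally possible.

```python
# Toy stand-ins for the real Unet / Decoder classes (names are illustrative).
class ToyUnet:
    def __init__(self, name):
        self.name = name

    def loss(self, images):
        # A real unet would return a denoising loss; the string makes
        # the routing visible.
        return f"loss from {self.name}"

class ToyDecoder:
    def __init__(self, unets):
        self.unets = unets

    def forward(self, images, unet_number):
        # 1-indexed, matching the convention in this thread:
        # only the selected unet is touched.
        return self.unets[unet_number - 1].loss(images)

decoder = ToyDecoder([ToyUnet("base 64px unet"), ToyUnet("256px upsampler")])
print(decoder.forward("batch", unet_number=2))  # -> loss from 256px upsampler
```

Whether the real Decoder can be constructed with only one unet's weights loaded (rather than both, as the commenter observed) is exactly the open question here.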

YUHANG-Ma avatar Jun 26 '22 05:06 YUHANG-Ma