Phil Wang
@rom1504 it works! :pray: https://github.com/lucidrains/DALLE2-pytorch/releases/tag/0.0.73
@TheoCoombes so one thing to note is that in the paper, they actually sampled a couple image embeddings (well, just 2 i guess), and then selected the one with the...
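(rough sketch of what that reranking could look like: sample a few image embeddings from the prior and keep the one most similar to the text embedding, as described in the paper. `prior.sample` here is just a placeholder, not the repo's actual interface)

```python
import torch
import torch.nn.functional as F

def sample_best_image_embed(prior, text_embed, num_samples = 2):
    # draw several candidate image embeddings from the diffusion prior
    # shape: (num_samples, batch, dim)
    candidates = torch.stack([prior.sample(text_embed) for _ in range(num_samples)])

    # cosine similarity of each candidate against the conditioning text embedding
    sims = F.cosine_similarity(candidates, text_embed.unsqueeze(0), dim = -1)  # (num_samples, batch)

    # per batch element, keep the candidate with the highest similarity
    best = sims.argmax(dim = 0)                      # (batch,)
    batch_idx = torch.arange(text_embed.shape[0])
    return candidates[best, batch_idx]               # (batch, dim)
```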
@rom1504 very nice! :D
https://github.com/lucidrains/DALLE2-pytorch/issues/23#issuecomment-1127011855 we can share the preprint once it gets released on arxiv
@Veldrovive it is safe to just go with the same hyperparameters as Imagen, since Imagen outperforms DALLE2 anyways. we know at the very least that scaling the unets...
they are also using the BSR degradation used by Rombach et al https://github.com/CompVis/latent-diffusion/tree/e66308c7f2e64cb581c6d27ab6fbeb846828253b/ldm/modules/image_degradation https://github.com/cszn/BSRGAN/blob/main/utils/utils_blindsr.py which I don't have in the repository yet. tempted to just go with Imagen's noising procedure...
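(for reference, a bare-bones stand-in for that kind of blind-SR degradation on the low-res conditioning image: blur, then downsample, then noise, then jpeg. nowhere near the full randomized BSRGAN pipeline linked above)

```python
import io
import random
import numpy as np
from PIL import Image, ImageFilter

def simple_blind_degradation(img: Image.Image, scale: int = 4) -> Image.Image:
    # expects an RGB PIL image; returns a degraded, downsampled version
    w, h = img.size

    # 1. random gaussian blur
    img = img.filter(ImageFilter.GaussianBlur(radius = random.uniform(0.5, 3.0)))

    # 2. downsample with a randomly chosen interpolation mode
    interp = random.choice([Image.BILINEAR, Image.BICUBIC, Image.NEAREST])
    img = img.resize((w // scale, h // scale), resample = interp)

    # 3. additive gaussian noise
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0, random.uniform(1, 10), arr.shape)
    img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

    # 4. jpeg compression at a random quality
    buf = io.BytesIO()
    img.save(buf, format = 'JPEG', quality = random.randint(30, 95))
    buf.seek(0)
    return Image.open(buf).convert('RGB')
```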
ok, `0.11.0` should allow for different noise schedules across different unets, as in the paper. after adding the BSR image degradation (or some alternative), i think i'm comfortable giving...
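(the two schedules in question, for anyone wanting them outside the repo: the linear one from DDPM and the cosine one from improved DDPM. the dict at the end is just illustrative, not the actual `0.11.0` interface)

```python
import math
import torch

def linear_beta_schedule(timesteps, beta_start = 1e-4, beta_end = 0.02):
    # linear schedule from Ho et al. (DDPM)
    return torch.linspace(beta_start, beta_end, timesteps)

def cosine_beta_schedule(timesteps, s = 0.008):
    # cosine schedule from Nichol & Dhariwal (improved DDPM)
    steps = torch.arange(timesteps + 1, dtype = torch.float64)
    alphas_cumprod = torch.cos(((steps / timesteps) + s) / (1 + s) * math.pi * 0.5) ** 2
    alphas_cumprod = alphas_cumprod / alphas_cumprod[0]
    betas = 1 - (alphas_cumprod[1:] / alphas_cumprod[:-1])
    return betas.clamp(0, 0.999).float()

# e.g. cosine for the base unet, linear for the super-resolution unet
schedules = {
    'base_unet': cosine_beta_schedule(1000),
    'sr_unet': linear_beta_schedule(1000),
}
```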
> I understand only the image (and clip image EMB) is needed and no text ?

@rom1504 yup, no text conditioning needed, i think it should all be in the...
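(a minimal sketch of what such a dataset could look like, assuming the clip image embeddings were precomputed, e.g. with clip-retrieval / img2dataset, and stored row-aligned with the images. the class name and layout are made up for illustration)

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ImageEmbedDataset(Dataset):
    # yields (image, clip_image_embed) pairs - no captions or text embeddings needed
    def __init__(self, image_folder, embeddings_npy, image_size = 256):
        self.paths = sorted(Path(image_folder).glob('*.jpg'))
        self.embeds = np.load(embeddings_npy)   # (num_images, 512) for ViT-B/32
        assert len(self.paths) == len(self.embeds)
        self.transform = transforms.Compose([
            transforms.Resize(image_size),
            transforms.CenterCrop(image_size),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = self.transform(Image.open(self.paths[idx]).convert('RGB'))
        emb = torch.from_numpy(self.embeds[idx]).float()
        return img, emb
```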
@rom1504 what is the `.not` file extension?
the generator is actually interesting because it consists of multiple networks (cascading unets), and they can be trained separately (from the base network all the way to the super-resolving one...
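(a rough sketch of training the cascade one unet at a time. `decoder.unets`, `unet_number`, and the loss call are placeholders for whatever the real training interface looks like, not the repo's exact API)

```python
import torch

def train_one_unet(decoder, dataloader, unet_number, epochs = 1, lr = 1e-4):
    # only optimize the parameters of the unet currently being trained
    params = decoder.unets[unet_number - 1].parameters()
    opt = torch.optim.Adam(params, lr = lr)

    for _ in range(epochs):
        for images, image_embeds in dataloader:
            # the decoder returns the diffusion loss for the selected unet
            loss = decoder(images, image_embed = image_embeds, unet_number = unet_number)
            opt.zero_grad()
            loss.backward()
            opt.step()

# base unet first, then the super-resolution unet(s), each at its own resolution
# train_one_unet(decoder, base_loader, unet_number = 1)
# train_one_unet(decoder, sr_loader, unet_number = 2)
```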