Robin Rombach

Results: 15 comments of Robin Rombach

Hi, thanks for checking out our code! What you describe is most likely triggered by another error that occurs during the initialization of the script. Please check the full stack...

Which version of pytorch-lightning are you using? This code still targets `pl==0.9` and is not compatible with lightning versions >= 1.0. Additionally, you can try setting `save_top_k=0`,...
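A minimal sketch of the suggested fix: pin the dependency to the 0.9 series the repo was written against (the exact patch release used here, 0.9.0, is an assumption).

```shell
# Pin pytorch-lightning to the 0.9 series this code expects;
# the specific patch version (0.9.0) is an assumption.
pip install "pytorch-lightning==0.9.0"
```

With a compatible version installed, checkpoint saving can additionally be disabled by passing `save_top_k=0` to lightning's `ModelCheckpoint` callback.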

Thanks! The VQGAN benefits greatly from being trained as long as possible (provided the dataset is large enough and overfitting is a secondary concern), and tuning in the discriminator...

Hi, thanks for your interest in our work. We just released scripts for text-to-image and class-conditional synthesis (and corresponding checkpoints) with #27.

Hey @zhihongp, thanks for catching this! I have just added the VQGAN loss in f13bf9bf463d95b5a16aeadd2b02abde31f769f8. It is the same as in the taming-transformers repo, but provides some additional information about...

Hi, sorry for the late reply. I will take a guess and suggest running the training with the following command: `python main.py --base configs/latent-diffusion/lsun_churches-ldm-kl-8.yaml -t --gpus --scale_lr False` This...

For the different conditioning tasks (semantic synthesis, depth-to-image etc) we train different transformer models. The VQGAN on ImageNet should be fairly general and we re-use it across some tasks, but...

Hi. Yes, one way is to cache the already computed attention keys and values when generating a sequence. See for example https://huggingface.co/transformers/quickstart.html#using-the-past. Note that this is not currently implemented for our models as...
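To make the caching idea concrete, here is a minimal pure-Python sketch of key/value caching for single-head attention during autoregressive decoding. It is not the Hugging Face `past` API itself; all names and the toy numbers are illustrative. At each step, the new token's key and value are appended to a cache, so only one new attention row is computed instead of re-running attention over the whole prefix.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(q, keys, values):
    # scaled dot-product attention for a single query vector
    # against a list of key/value vectors
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    w = softmax(scores)
    dim = len(values[0])
    return [sum(wi * v[j] for wi, v in zip(w, values)) for j in range(dim)]

# Incremental decoding: append each step's key/value to the cache,
# then attend the current query against the cached prefix only.
kv_cache = {"k": [], "v": []}
outputs = []
steps = [([1.0, 0.0], [0.5, 0.5], [1.0, 2.0]),   # (query, key, value) per step
         ([0.0, 1.0], [0.2, 0.8], [3.0, 4.0])]
for q, k, v in steps:
    kv_cache["k"].append(k)
    kv_cache["v"].append(v)
    outputs.append(attend(q, kv_cache["k"], kv_cache["v"]))

# Sanity check: the cached result at step 2 matches a full recomputation
# of attention over the entire prefix from scratch.
full = attend(steps[1][0], [k for _, k, _ in steps], [v for _, _, v in steps])
assert full == outputs[1]
```

The saving is that the cached variant computes one new attention row per step, while recomputing from scratch costs time quadratic in the sequence length overall.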

Hi, which embeddings are you referring to exactly? Do you mean "internal" transformer representations?