imagen-pytorch
Specific values for the Unet parameters
I have some questions about running Imagen:
1. If I want to get results similar to those reported in the paper, how should I set the Unet parameters? The results I have gotten so far are not good.
2. If I only need 256×256 images as the final output, how many epochs should I train unet1 and unet2 for, respectively?
3. I can't download the T5 text encoder directly on the server, so I need to download the files locally first and copy them over. Where can I download the T5 text encoder, and which directory should I put the files in?
4. The images in the COCO 2017 dataset do not have a uniform shape. How is the preprocessing done? Are they resized directly to a square size?
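
To make question 4 concrete, here is a minimal sketch of one alternative to a direct resize that I am considering: scale the shorter side to the target size, then center-crop to a square so the aspect ratio is not distorted. The function name `square_crop_plan` and the default `target=256` are my own assumptions for illustration. I do not know whether this matches what the repo or the paper actually does.

```python
def square_crop_plan(width, height, target=256):
    """Plan a resize-then-center-crop for a non-square image.

    Hypothetical helper (not part of imagen-pytorch). Returns the size to
    resize the image to, and the crop box (left, top, right, bottom) that
    cuts a target x target square out of the center of the resized image.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("width, height and target must be positive")

    # Scale so the shorter side becomes exactly `target` pixels.
    scale = target / min(width, height)
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))

    # Center the square crop inside the resized image.
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


if __name__ == "__main__":
    # A typical 640x480 landscape photo.
    print(square_crop_plan(640, 480))
```

The returned size and box could then be passed to an image library's resize and crop calls. A direct resize to 256×256 would skip the crop but stretch the image instead.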