CogView2
How many tokens are trained during pretraining of the text-to-image generation model in the paper?
Do you mean the max sequence length? We use a sequence length of 512 (400 image tokens + up to 112 text tokens). I think the other hyperparameters can be found in the paper, except for the learning rate, which I need to check in the code after work, but I think any value around 1e-4 with warmup should be okay.
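Just to illustrate the setup described above, here is a minimal sketch (not the official CogView2 training script) of the 512-token sequence layout and a linear warmup schedule around 1e-4; the model, `warmup_steps`, and other values are placeholder assumptions.

```python
import torch

# Sequence layout mentioned above: up to 112 text tokens followed by 400 image tokens.
MAX_TEXT_TOKENS = 112          # assumption based on the reply above
IMAGE_TOKENS = 400             # image tokens per sample
SEQ_LEN = MAX_TEXT_TOKENS + IMAGE_TOKENS   # 512 total

# Placeholder model and optimizer; the real model is the CogView2 transformer.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

warmup_steps = 2000            # hypothetical value, not from the paper

def warmup_lambda(step):
    # Linearly ramp the learning rate from 0 to the base lr over warmup_steps,
    # then hold it constant (any decay schedule could follow).
    return min(1.0, (step + 1) / warmup_steps)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_lambda)

for step in range(10):
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())
```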
I mean the training hyperparameters, such as the number of training steps and the tokens per batch. The CogView paper reported these parameters.
Hi, please refer to Section 3.2 of the CogView2 paper~
Thanks a lot! Also, I had a question: when the model is able to generate a recognizable image, roughly what level should the loss be at? Could you share some training logs that I can use as a reference for my experiments?