Tianhong Li

65 comments by Tianhong Li

It can be any value outside the tokenizer codebook index range (0-1023).
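For illustration, a minimal sketch of that constraint (variable names are mine, not the repo's): with a 1024-entry codebook, using index 1024 as the mask token guarantees it never collides with a real token id, and the embedding table only needs one extra row to cover it.

```python
import torch

codebook_size = 1024            # tokenizer codebook indices are 0..1023
mask_token_id = codebook_size   # 1024: outside the codebook range, so no collision

tokens = torch.randint(0, codebook_size, (1, 256))  # e.g. a 16x16 grid of token ids
mask = torch.rand(tokens.shape) < 0.5               # example random mask
tokens_masked = torch.where(mask, torch.full_like(tokens, mask_token_id), tokens)

# The embedding table must then cover the mask token as well:
embedding = torch.nn.Embedding(codebook_size + 1, 256)
features = embedding(tokens_masked)
```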

A loss of 2.6 is already quite low if you use the default masking strategy: in our experiments, the training loss converges to around 5.7-5.8.

We follow [BERT](https://github.com/google-research/bert/blob/master/modeling.py#L126) for this design. I haven't tried using logits directly from an fc layer.
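For context, the BERT design referenced here does not take logits straight from an fc layer: it applies a dense transform with GELU and LayerNorm, then computes logits against the (tied) token-embedding matrix plus a learned bias. A minimal PyTorch sketch under those assumptions (class name and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class MlmHead(nn.Module):
    """BERT-style MLM head: transform + GELU + LayerNorm, then project
    onto the token-embedding matrix instead of a standalone fc layer."""
    def __init__(self, hidden_dim, emb_dim, vocab_size):
        super().__init__()
        self.transform = nn.Linear(hidden_dim, emb_dim)
        self.act = nn.GELU()
        self.norm = nn.LayerNorm(emb_dim)
        self.bias = nn.Parameter(torch.zeros(vocab_size))

    def forward(self, x, token_embedding_weight):
        # x: (batch, seq, hidden_dim); token_embedding_weight: (vocab, emb_dim)
        h = self.norm(self.act(self.transform(x)))
        return h @ token_embedding_weight.t() + self.bias  # (batch, seq, vocab)

# Usage with a tied embedding table:
emb = nn.Embedding(1025, 256)  # 1024 codebook ids + mask token
head = MlmHead(hidden_dim=768, emb_dim=256, vocab_size=1025)
logits = head(torch.randn(2, 256, 768), emb.weight)
```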

Unfortunately, I don't have a QQ account. You can post your problem here and I'll do my best to help.

It looks like you are loading a ViT-Large pre-trained checkpoint into a ViT-Base model. Try setting --model mage_vit_large_patch16 when running gen_img_uncond.py (see the sketch below).
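A minimal sketch of that failure mode, using stand-in modules rather than the actual MAGE classes: a checkpoint saved from a ViT-Large model (embedding dim 1024) cannot be loaded into a ViT-Base model (embedding dim 768), and load_state_dict raises size-mismatch errors until the architectures agree.

```python
import torch
import torch.nn as nn

large = nn.Linear(1024, 1024)     # stand-in for one ViT-Large weight matrix
base = nn.Linear(768, 768)        # stand-in for the same layer in ViT-Base

ckpt = large.state_dict()         # pretend this is the pre-trained checkpoint
try:
    base.load_state_dict(ckpt)    # what happens with the wrong --model
except RuntimeError as e:
    print(e)                      # size mismatch for weight / bias

matching = nn.Linear(1024, 1024)  # choosing the matching architecture,
matching.load_state_dict(ckpt)    # i.e. --model mage_vit_large_patch16, fixes it
```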