
Are the pretrained MAE encoder weights available?

Open CheungZeeCn opened this issue 2 years ago • 2 comments

in config:

"mae_checkpoint": "mae_models/mae_pretrain_vit_large_full.pth"

in udop_dual:

self.vision_encoder = mae_model(config.mae_version, config.mae_checkpoint, config.image_size, config.vocab_size, config.max_2d_position_embeddings)

But I found no pretrained weights for the MAE encoder. Are the pretrained MAE encoder weights available now?

Thank you!

CheungZeeCn avatar Aug 01 '23 09:08 CheungZeeCn

The MAE weights are included in the released checkpoint together with the transformer weights. If you want the original MAE weights, you can download them from the original MAE codebase.

zinengtang avatar Aug 03 '23 21:08 zinengtang

In the transformer weights, the only MAE weights are those for patch_embed and special_vis_token (and pos_embed); there are none for the blocks. And in the forward method, you do indeed use only patch_embed to encode the patches.

Do you not use the full MAE as in udop_dual? Does this simple projection really carry all the information about font, line spacing, color, and so on?
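One way to check which MAE submodules a checkpoint actually contains is to group its parameter names by the component that follows the vision encoder prefix. The snippet below is a minimal sketch, not code from the repository: the `vision_encoder.` prefix and the example key names are assumptions for illustration, and the real names depend on the checkpoint you load.

```python
from collections import Counter


def summarize_submodules(keys, prefix):
    """Count parameter names under `prefix`, grouped by the next name component."""
    counts = Counter()
    for key in keys:
        if key.startswith(prefix):
            rest = key[len(prefix):]
            # "patch_embed.proj.weight" -> "patch_embed"
            counts[rest.split(".", 1)[0]] += 1
    return dict(counts)


# Hypothetical key names for illustration only; real names depend on the checkpoint.
example_keys = [
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "vision_encoder.patch_embed.proj.weight",
    "vision_encoder.patch_embed.proj.bias",
    "vision_encoder.special_vis_token",
    "vision_encoder.pos_embed",
]

summary = summarize_submodules(example_keys, "vision_encoder.")
has_blocks = "blocks" in summary  # True only if transformer block weights are present
print(summary, has_blocks)
```

With a real checkpoint, the keys could come from something like `torch.load(path, map_location="cpu").keys()`, assuming the file stores a flat state dict; if the weights are nested under another key, unwrap that first.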

znb899 avatar Aug 31 '23 15:08 znb899