unilm
BEiTv2 MIM accuracy
Describe Model: I am using BEiTv2 ViT-L/16.
I pre-trained on ImageNet-1K with the vqkd tokenizer, but the MIM (Masked Image Modeling) accuracy does not reach 40~50%.
Can you provide a log or an accuracy reference for pre-training on ImageNet-1K with the tokenizer? Also, if you have evaluation results on ImageNet-1K, could you share them with us?
Hello,
The MIM accuracy is about 16% when using the vqkd tokenizer to pretrain ViT-L/16.
With a 1600-epoch pretraining schedule, the accuracy increases slightly.
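For context, the MIM accuracy discussed here is typically the fraction of masked patches whose predicted visual-token id matches the target id produced by the vqkd tokenizer. A minimal sketch of that metric (the function name, shapes, and toy values are illustrative, not BEiTv2's actual code):

```python
def mim_accuracy(pred_ids, target_ids, mask):
    """Fraction of masked positions where the predicted
    visual-token id equals the tokenizer's target id."""
    # Count correct predictions, but only at masked positions.
    correct = sum(1 for p, t, m in zip(pred_ids, target_ids, mask) if m and p == t)
    total = sum(mask)
    return correct / total if total else 0.0

# Toy example: 4 patches, the first 3 are masked.
preds   = [17, 42, 99, 5]
targets = [17, 42, 12, 5]
mask    = [True, True, True, False]
print(mim_accuracy(preds, targets, mask))  # → 0.666... (2 of 3 masked patches correct)
```

Low absolute values (e.g. ~16%) are still informative because the codebook is large, so chance-level accuracy is far lower.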
Thank you! We also have the same trends during training.
During training, the model gets better at predicting masked patches from visual information at every scale, even though it is hard to predict all scales correctly. The latest checkpoint performs best.
What about BEiTv2 ViT-B/16?