Some questions about changing the image classification dataset

Open cb-rep opened this issue 1 year ago • 1 comments

If I change the dataset, for example to one with 47 classes, what else do I need to do besides changing nb_classes to 47 in main_finetune? With only that change, the final accuracy is not very high. I am not sure whether the 1000 in vocab_size = codebook_size + 1000 + 1 should also be modified. If I do modify it, loading the checkpoint reports an error: RuntimeError: Error(s) in loading state_dict for VisionTransformerMage: size mismatch for token_emb.word_embeddings.weight: copying a param with shape torch.Size([2025, 768]) from checkpoint, the shape in current model is torch.Size([1072, 768]).

cb-rep avatar Nov 24 '23 13:11 cb-rep

If you plan to fine-tune the ImageNet pre-trained MAGE on your dataset, you only need to change nb_classes to 47 in main_finetune. Performance can be poor for many reasons; one could be that your dataset is too far from the ImageNet image distribution. You could also consider adjusting the number of training epochs: if your dataset is much smaller than ImageNet, you should increase the fine-tuning epochs.

LTH14 avatar Nov 24 '23 15:11 LTH14
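
For readers who hit the same size mismatch, the arithmetic behind the error can be checked with a minimal sketch. It assumes codebook_size = 1024 and an embedding width of 768, both inferred from the shapes in the error message rather than read from the repository; the helper name token_embedding_shape and its extra_tokens parameter are hypothetical and exist only for this illustration.

```python
# Assumptions (inferred from the error message, not taken from the MAGE code):
#   codebook_size = 1024, embedding width = 768.
CODEBOOK_SIZE = 1024
EMBED_DIM = 768


def token_embedding_shape(extra_tokens, codebook_size=CODEBOOK_SIZE, embed_dim=EMBED_DIM):
    """Hypothetical helper: shape of token_emb.word_embeddings.weight
    under vocab_size = codebook_size + extra_tokens + 1."""
    vocab_size = codebook_size + extra_tokens + 1
    return (vocab_size, embed_dim)


# Shape stored in the pre-trained checkpoint (the formula left at 1000).
checkpoint_shape = token_embedding_shape(1000)

# Shape the model is built with if the 1000 is replaced by 47.
modified_shape = token_embedding_shape(47)

print(checkpoint_shape)  # (2025, 768)
print(modified_shape)    # (1072, 768)
print(checkpoint_shape == modified_shape)  # False
```

The two shapes reproduce the ones in the RuntimeError, which suggests the 1000 in the vocab_size formula fixes the size of the pre-trained token embedding and is separate from nb_classes. Leaving it at 1000 keeps the checkpoint loadable, consistent with the advice above to change only nb_classes.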