Issue trying to use MoBY SSL pretrained model with Swin-T backbone
When trying to load the checkpoint after SSL pretraining with MoBY (1 epoch, just to test), I get the error below when passing the --pretrained
flag pointing to the checkpoint (I've tried both ckpt_epoch_0.pth and checkpoint.pth). I am trying to transfer the self-supervised weights to this architecture with the same backbone.
```
[2022-03-11 20:31:06 swin_tiny_patch4_window7_224](utils.py 47): INFO ==============> Loading weight **********/MOBY SSL SWIN/moby__swin_tiny__patch4_window7_224__odpr02_tdpr0_cm099_ct02_queue4096_proj2_pred2/default/ckpt_epoch_0.pth for fine-tuning......
Traceback (most recent call last):
  File "/content/Swin-Transformer/main.py", line 357, in <module>
    main(config)
  File "/content/Swin-Transformer/main.py", line 131, in main
    load_pretrained(config, model_without_ddp, logger)
  File "/content/Swin-Transformer/utils.py", line 70, in load_pretrained
    relative_position_bias_table_current = model.state_dict()[k]
KeyError: 'encoder.layers.0.blocks.0.attn.relative_position_bias_table'
Killing subprocess 3303
```
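For anyone debugging the same KeyError: the weights themselves are fine, only the key names differ. A quick check (a sketch, not repo code; the checkpoint path is a placeholder) makes the prefix mismatch obvious:

```python
import torch

# Placeholder path -- substitute your MoBY checkpoint.
checkpoint = torch.load('ckpt_epoch_0.pth', map_location='cpu')
state_dict = checkpoint['model']

# MoBY stores the backbone as 'encoder.layers.0.blocks.0.attn....',
# while the Swin classification model expects 'layers.0.blocks.0.attn....'.
print(sorted(state_dict.keys())[:5])
```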
I was able to (mostly) load the checkpoint by changing the loading code in utils.py:
- adding the two marked lines to strip the `encoder.` prefix:
```python
def load_pretrained(config, model, logger):
    logger.info(f"==============> Loading weight {config.MODEL.PRETRAINED} for fine-tuning......")
    checkpoint = torch.load(config.MODEL.PRETRAINED, map_location='cpu')
    state_dict = checkpoint['model']

    # added: MoBY saves the backbone under an 'encoder.' prefix, so strip it
    if sorted(list(state_dict.keys()))[0].startswith('encoder'):
        state_dict = {k.replace('encoder.', ''): v for k, v in state_dict.items() if k.startswith('encoder.')}
```
- commenting out the classifier-head remapping code (the MoBY checkpoint has no `head.bias`), so the head is always re-initialized to zero by the code that was in the `else` branch:
```python
    # check classifier, if not match, then re-init classifier to zero
    # head_bias_pretrained = state_dict['head.bias']
    # Nc1 = head_bias_pretrained.shape[0]
    # Nc2 = model.head.bias.shape[0]
    # if (Nc1 != Nc2):
    #     if Nc1 == 21841 and Nc2 == 1000:
    #         logger.info("loading ImageNet-22K weight to ImageNet-1K ......")
    #         map22kto1k_path = f'data/map22kto1k.txt'
    #         with open(map22kto1k_path) as f:
    #             map22kto1k = f.readlines()
    #         map22kto1k = [int(id22k.strip()) for id22k in map22kto1k]
    #         state_dict['head.weight'] = state_dict['head.weight'][map22kto1k, :]
    #         state_dict['head.bias'] = state_dict['head.bias'][map22kto1k]
    #     else:
    torch.nn.init.constant_(model.head.bias, 0.)
    torch.nn.init.constant_(model.head.weight, 0.)
    # del state_dict['head.weight']
    # del state_dict['head.bias']
    logger.warning(f"Error in loading classifier head, re-init classifier head to 0")

    msg = model.load_state_dict(state_dict, strict=False)
    logger.warning(msg)

    logger.info(f"=> loaded successfully '{config.MODEL.PRETRAINED}'")

    del checkpoint
    torch.cuda.empty_cache()
```
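An alternative is to convert the MoBY checkpoint once, offline, into a checkpoint whose keys already match the plain Swin model. A minimal sketch (file names are placeholders; it assumes the weights sit under a 'model' key with the 'encoder.' prefix, as in the error above):

```python
import torch

# Placeholder paths -- substitute your own.
src = 'ckpt_epoch_0.pth'              # MoBY SSL checkpoint
dst = 'moby_swin_tiny_backbone.pth'   # converted, Swin-compatible checkpoint

checkpoint = torch.load(src, map_location='cpu')
state_dict = checkpoint['model']

# Keep only the online encoder weights and drop the 'encoder.' prefix.
backbone = {k[len('encoder.'):]: v
            for k, v in state_dict.items()
            if k.startswith('encoder.')}

torch.save({'model': backbone}, dst)
```

Note that the stock load_pretrained still expects 'head.weight'/'head.bias' in the checkpoint, so the head-handling change above (or adding dummy head tensors) is still needed.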
After these changes it loads, but some keys are reported as missing; see the log below:
```
_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index',
'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index',
'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index',
'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index',
'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index',
'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index', 'head.weight', 'head.bias'], unexpected_keys=[])
```
I also found that, after these changes, a similar log is shown when loading pre-trained models from this repo, such as the ImageNet-pretrained swin_tiny_patch4_window7_224.pth:
```
_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index',
'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index',
'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index',
'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index',
'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index',
'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index'], unexpected_keys=[])
```
Ah, these keys are all re-initialized at model construction anyway, so it doesn't matter. I might open a PR to allow for use of MoBY pre-trained models.
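For anyone unsure about the missing_keys list: a quick way to convince yourself it is harmless (a sketch, assuming the repo's build_model factory and a parsed config object as used in main.py) is to check that those keys are buffers the model recomputes in __init__, so they never need to come from a checkpoint:

```python
from models import build_model  # Swin-Transformer repo's model factory

# Assumes `config` is the same parsed config object main.py passes around.
model = build_model(config)
buffers = dict(model.named_buffers())

# relative_position_index and attn_mask are registered buffers derived from
# the window size at construction time, not learned weights.
print('layers.0.blocks.0.attn.relative_position_index' in buffers)  # True
print('layers.0.blocks.1.attn_mask' in buffers)                     # True
```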
Same issue here when using MoBY pretrained models.