
Issue trying to use MoBY SSL pretrained model with Swin-T backbone

Open Giles-Billenness opened this issue 2 years ago • 4 comments

When trying to load a checkpoint produced by MoBY SSL pretraining (1 epoch, just to test), I get the error below after passing the checkpoint to the --pretrained flag (I've tried both ckpt_epoch_0.pth and checkpoint.pth). I am trying to transfer the self-supervised pretrained weights to this architecture, since it uses the same backbone.

```
[2022-03-11 20:31:06 swin_tiny_patch4_window7_224](utils.py 47): INFO ==============> Loading weight **********/MOBY SSL SWIN/moby__swin_tiny__patch4_window7_224__odpr02_tdpr0_cm099_ct02_queue4096_proj2_pred2/default/ckpt_epoch_0.pth for fine-tuning......
Traceback (most recent call last):
  File "/content/Swin-Transformer/main.py", line 357, in <module>
    main(config)
  File "/content/Swin-Transformer/main.py", line 131, in main
    load_pretrained(config, model_without_ddp, logger)
  File "/content/Swin-Transformer/utils.py", line 70, in load_pretrained
    relative_position_bias_table_current = model.state_dict()[k]
KeyError: 'encoder.layers.0.blocks.0.attn.relative_position_bias_table'
Killing subprocess 3303
```
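The traceback points at the root cause: every backbone key in the MoBY checkpoint carries an `encoder.` prefix (e.g. `encoder.layers.0.blocks.0.attn.relative_position_bias_table`), while the classification SwinTransformer exposes the bare key names, so the lookup in `load_pretrained` fails. A quick way to see this is to inspect the checkpoint directly; a minimal sketch, with the path below as a placeholder for your own checkpoint:

```python
import torch

# Placeholder path; point this at the MoBY checkpoint you are trying to load.
ckpt = torch.load('ckpt_epoch_0.pth', map_location='cpu')
state_dict = ckpt['model']

# MoBY backbone weights are stored as 'encoder.<swin key>'; the classification
# model's state_dict uses '<swin key>' directly, hence the KeyError above.
print([k for k in state_dict if k.startswith('encoder.')][:5])
print([k for k in state_dict if not k.startswith('encoder.')][:5])
```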

Giles-Billenness avatar Mar 11 '22 20:03 Giles-Billenness

I was able to load the checkpoint (mostly) by making two changes to the loading code in `utils.py`:

1. Adding the lines marked with `>` right after `state_dict = checkpoint['model']`:

def load_pretrained(config, model, logger):
    logger.info(f"==============> Loading weight {config.MODEL.PRETRAINED} for fine-tuning......")
    checkpoint = torch.load(config.MODEL.PRETRAINED, map_location='cpu')
    state_dict = checkpoint['model']

>    # keep only the 'encoder.' keys and strip the prefix so they match the bare Swin backbone
>    if sorted(list(state_dict.keys()))[0].startswith('encoder'):
>        state_dict = {k.replace('encoder.', ''): v for k, v in state_dict.items() if k.startswith('encoder.')}
2. Commenting out the classifier-head shape check, so the head is always re-initialized to zero (previously this only happened in the `else` branch):
    # check classifier, if not match, then re-init classifier to zero
    # head_bias_pretrained = state_dict['head.bias']
    # Nc1 = head_bias_pretrained.shape[0]
    # Nc2 = model.head.bias.shape[0]
    # if (Nc1 != Nc2):
    #     if Nc1 == 21841 and Nc2 == 1000:
    #         logger.info("loading ImageNet-22K weight to ImageNet-1K ......")
    #         map22kto1k_path = f'data/map22kto1k.txt'
    #         with open(map22kto1k_path) as f:
    #             map22kto1k = f.readlines()
    #         map22kto1k = [int(id22k.strip()) for id22k in map22kto1k]
    #         state_dict['head.weight'] = state_dict['head.weight'][map22kto1k, :]
    #         state_dict['head.bias'] = state_dict['head.bias'][map22kto1k]
    #     else:
    torch.nn.init.constant_(model.head.bias, 0.)
    torch.nn.init.constant_(model.head.weight, 0.)
    # del state_dict['head.weight']
    # del state_dict['head.bias']
    logger.warning(f"Error in loading classifier head, re-init classifier head to 0")

    msg = model.load_state_dict(state_dict, strict=False)
    logger.warning(msg)

    logger.info(f"=> loaded successfully '{config.MODEL.PRETRAINED}'")

    del checkpoint
    torch.cuda.empty_cache()
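For reference, the same renaming could also be done once, offline, to produce a backbone-only checkpoint instead of patching `load_pretrained`. A minimal sketch (the file names are placeholders); note the converted state dict still contains no `head.*` entries, which is why the head re-init in step 2 is still needed:

```python
import torch

def moby_to_swin(src_path: str, dst_path: str) -> None:
    """Strip the MoBY 'encoder.' prefix and save a backbone-only checkpoint."""
    state_dict = torch.load(src_path, map_location='cpu')['model']
    backbone = {k[len('encoder.'):]: v
                for k, v in state_dict.items()
                if k.startswith('encoder.')}
    # Keep the {'model': ...} layout that load_pretrained expects.
    torch.save({'model': backbone}, dst_path)

# Placeholder file names for illustration.
moby_to_swin('ckpt_epoch_0.pth', 'moby_swin_tiny_backbone.pth')
```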

After these changes it loads, but some keys are reported missing; see the log below:

_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index', 
                                'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index', 
                                'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index', 
                                'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index', 'head.weight', 'head.bias'], unexpected_keys=[])

Giles-Billenness avatar Mar 14 '22 16:03 Giles-Billenness

I also found that, after these changes, loading one of the pre-trained models from this repo (e.g. the ImageNet-trained swin_tiny_patch4_window7_224.pth) shows a similar log:

_IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_position_index', 
                                'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn_mask', 'layers.2.blocks.1.attn.relative_position_index', 
                                'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn_mask', 'layers.2.blocks.3.attn.relative_position_index', 
                                'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn_mask', 'layers.2.blocks.5.attn.relative_position_index', 
                                'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_position_index'], unexpected_keys=[])

Giles-Billenness avatar Mar 14 '22 17:03 Giles-Billenness

Ah, these are all re-initialized anyway (`relative_position_index` and `attn_mask` are buffers recomputed when the model is constructed, and the head is re-initialized above), so it doesn't matter. I might open a PR to allow the use of MoBY pre-trained models.
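If it helps, a small guard could make that explicit: right after the `load_state_dict(..., strict=False)` call in `load_pretrained`, check that the only missing keys are the buffers Swin rebuilds in `__init__` (`relative_position_index`, `attn_mask`) plus the re-initialized head. A sketch of such a check (indented as it would sit inside the function):

```python
    msg = model.load_state_dict(state_dict, strict=False)
    # attn_mask and relative_position_index are buffers recomputed when the
    # model is constructed, and the head is re-initialized above, so these are
    # the only keys that may legitimately be missing.
    benign_suffixes = ('attn_mask', 'relative_position_index')
    bad_missing = [k for k in msg.missing_keys
                   if not (k.endswith(benign_suffixes) or k.startswith('head.'))]
    assert not bad_missing, f"unexpected missing keys: {bad_missing}"
    assert not msg.unexpected_keys, f"unexpected keys in checkpoint: {msg.unexpected_keys}"
```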

Giles-Billenness avatar Mar 16 '22 16:03 Giles-Billenness

Same issue when using MoBY pretrained models.

kavin-du avatar Nov 19 '22 05:11 kavin-du