SimMIM
Why does Swin-Large-W12 contain a [36, 36] `encoder.layers.3.blocks.0.attn.relative_position_index`?
The original Swin-Large and the current code both seem to use the same `window_size` in every layer, so if `window_size` is set to 12, every `relative_position_index` should have shape [144, 144].
But I found that the provided checkpoint has shape [36, 36] for `encoder.layers.3.blocks.0.attn.relative_position_index`.
Am I missing something?
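For reference, here is a minimal NumPy sketch of the shape arithmetic behind the question. It assumes the index is built the usual Swin way (pairwise offsets between all positions in a window), so a window of side W gives a [W*W, W*W] table. One possible explanation for the [36, 36] shape, which is an assumption and not something confirmed in this thread, is that a block shrinks its window to the feature-map size when the feature map is smaller than `window_size`. With an assumed 192x192 input and patch size 4, the last stage would be 6x6, giving 6*6 = 36. `effective_window` is a hypothetical helper written for this illustration, not a function from the repository.

```python
import numpy as np


def relative_position_index(window_size):
    """Build the pairwise relative position index for a (Wh, Ww) window."""
    wh, ww = window_size
    # Coordinates of every position in the window: shape (2, Wh, Ww).
    coords = np.stack(np.meshgrid(np.arange(wh), np.arange(ww), indexing="ij"))
    coords_flat = coords.reshape(2, -1)  # (2, Wh*Ww)
    # Offsets between every pair of positions: shape (2, N, N), N = Wh*Ww.
    rel = coords_flat[:, :, None] - coords_flat[:, None, :]
    rel = rel.transpose(1, 2, 0).copy()  # (N, N, 2)
    rel[:, :, 0] += wh - 1               # shift row offsets to start at 0
    rel[:, :, 1] += ww - 1               # shift column offsets to start at 0
    rel[:, :, 0] *= 2 * ww - 1           # flatten the 2-D offset into one index
    return rel.sum(-1)                   # (N, N)


def effective_window(input_resolution, window_size):
    """Hypothetical helper. ASSUMPTION: the window is clamped to the
    feature-map size when the feature map is smaller than the window."""
    return min(min(input_resolution), window_size)


# ASSUMPTION: 192x192 input, patch size 4, 2x downsampling between 4 stages.
stage_resolutions = [(48, 48), (24, 24), (12, 12), (6, 6)]

for i, res in enumerate(stage_resolutions):
    w = effective_window(res, 12)
    shape = relative_position_index((w, w)).shape
    print(f"layers.{i}: resolution={res}, window={w}, index shape={shape}")
```

Under these assumptions only the last stage ends up with a 6x6 window, which would match the [36, 36] entry for `encoder.layers.3`, while the earlier stages keep [144, 144].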
Same question