
ViT Adapter Not Working With Patch Size Different From 16

Open · MatCorr opened this issue 1 year ago · 1 comment

I need to train a segmentor that uses a Transformer that has been pre-trained with patch_size=14.

I've made some adaptations to the ViT-Adapter/segmentation/mmseg_custom/models/backbones/vit_adapter.py file to allow for that, since at some points in the code patch_size was hard-coded to 16.

However, with that hurdle cleared, I'm now running into a problem with the ViT-Adapter/segmentation/ops/modules/ms_deform_attn.py file, which raises the following error when I try to train a model with patch_size 14.

File "/ViT-Adapter/segmentation/ops/modules/ms_deform_attn.py", line 105, in forward
    assert (input_spatial_shapes[:, 0] * input_spatial_shapes[:, 1]).sum() == Len_in
AssertionError
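For context, that assert just checks that the per-level spatial shapes multiply out to the flattened token length (Len_in). Here is a minimal sketch of the mismatch as I understand it; the image size 518 is illustrative (chosen to be divisible by 14), and the hard-coded //16 mirrors what the adapter code assumes:

```python
# The assert in ms_deform_attn.py requires:
#   (input_spatial_shapes[:, 0] * input_spatial_shapes[:, 1]).sum() == Len_in
# i.e. sum of H_i * W_i over levels == flattened token count.
img = 518  # illustrative image size, divisible by 14

# spatial extent the adapter declares when 16 is hard-coded
declared = (img // 16) * (img // 16)   # 32 * 32 = 1024

# tokens a patch-14 ViT actually emits for the same image
actual = (img // 14) * (img // 14)     # 37 * 37 = 1369

assert declared != actual  # hence the AssertionError
```

So the shapes handed to the attention module disagree with the number of tokens the backbone actually produces.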

Can anyone tell me what needs to change in the deformable attention code to support a patch size other than 16?

Thanks!

MatCorr avatar Nov 03 '23 13:11 MatCorr

OK, using the code pointed to here, I've converted the weights that were pre-trained with patch_size=14.

However, I'm still hitting the same error. I'm using the weights for segmentation, not detection, so I'm wondering if that's the issue.
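If it helps anyone, my current understanding is that the assert depends only on runtime tensor geometry, so converting the weights cannot fix it on its own: the spatial shapes fed to MSDeformAttn have to be derived from the patch size the backbone actually uses. A hypothetical sketch of such a helper (the name and the 7/14/28 pyramid are my guesses, mirroring the 8/16/32 pyramid used for patch 16; this is not the actual ViT-Adapter code):

```python
# Hypothetical helper (not the actual ViT-Adapter code): derive the
# deformable-attention spatial shapes from the real patch size instead
# of hard-coding 16, so sum(H_i * W_i) matches the flattened length.
def build_spatial_shapes(h, w, patch_size):
    # pyramid at half, one and two times the patch stride, mirroring
    # the 8/16/32 levels used when patch_size == 16
    strides = (patch_size // 2, patch_size, patch_size * 2)
    return [(h // s, w // s) for s in strides]

shapes = build_spatial_shapes(518, 518, 14)   # [(74, 74), (37, 37), (18, 18)]
len_in = sum(hh * ww for hh, ww in shapes)    # 7169 tokens across levels
```

With shapes built this way, the flattened features per level would have to be produced at the same strides for the check in ms_deform_attn.py to pass.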

MatCorr avatar Nov 03 '23 17:11 MatCorr