GCVit
Reshape error at x = x.reshape(B, 1, self.N, self.num_heads, self.dim_head).permute(0, 1, 3, 2, 4)
Dear all, I am training a model for emotion recognition and ran into this error at line 521: RuntimeError: shape '[32, 1, 49, 3, 32]' is invalid for input of size 49152. The batch size is 32 and the model is GCViT-S. Can someone help me solve this issue? Best regards
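The numbers in the error message already hint at the cause. The short sketch below is plain arithmetic on the values quoted above; the helper name `explain_reshape_mismatch` is hypothetical and is not part of the GCViT code. It compares the element count the reshape asks for with the element count the tensor actually has, and derives how many tokens per sample are really present.

```python
from math import prod


def explain_reshape_mismatch(target_shape, actual_numel):
    """Compare a requested reshape with the real element count (hypothetical helper)."""
    batch, _, n_tokens, num_heads, dim_head = target_shape
    channels = num_heads * dim_head          # 3 * 32 = 96 channels per token
    expected_numel = prod(target_shape)      # elements the reshape would need
    per_sample = actual_numel // batch       # elements actually present per sample
    actual_tokens = per_sample // channels   # tokens actually present per sample
    return {
        "expected_numel": expected_numel,
        "expected_tokens": n_tokens,
        "actual_tokens": actual_tokens,
    }


# Values taken from the RuntimeError in the post above.
info = explain_reshape_mismatch((32, 1, 49, 3, 32), 49152)
print(info)
```

The reshape expects 49 tokens per sample (a 7x7 grid), but the tensor only holds 16 (a 4x4 grid), which suggests the feature map reaching this line is spatially smaller than the model was built for.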
Hi @HaithemH
Thanks for the comment. Could you share some details so I can reproduce this error? For instance, the input size and the network definition (e.g. number of heads, window sizes, etc.), if you changed any of them.
Absolutely. I was training with a dataloader over 1.2M images, using the small version of the model. After loading, training started, but the error was raised in the GlobalQueryGen class, specifically at line 521 of its forward function. The input size is 112, and the model is defined as GCViT(depths=[3, 4, 19, 5], num_heads=[3, 6, 12, 24], window_size=[7, 7, 14, 7], dim=96, mlp_ratio=2, drop_path_rate=0.3, layer_scale=1e-5, num_classes=3).
Hi @HaithemH
The model expects an input size of 224x224. For your 112x112 input, you can adjust the PatchEmbed module: either change the stride of the Conv2d from 2 to 1 (the fourth argument here)
self.proj = nn.Conv2d(in_chans, dim, 3, 2, 1)
or remove the ReduceSize layer (take note of the tensor layout the next operation expects, channels-first or channels-last)
self.conv_down = ReduceSize(dim=dim, keep_dim=True)
Hope it helps!
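To see why either change should help, here is a minimal sketch that traces the spatial size through the downsampling steps. It assumes that every reduction uses kernel 3, stride 2, padding 1 (as in the Conv2d quoted above), that ReduceSize halves the resolution, and that the global query generator applies three further stride-2 reductions at the first level. That last assumption is not stated in this thread, but it is consistent with the 16 tokens implied by the error message. The function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Standard output-size formula for a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, patch_stride=2, use_reduce_size=True,
                num_query_reductions=3):
    """Trace the spatial size step by step (illustrative, not the GCViT code)."""
    sizes = [input_size]
    # PatchEmbed projection: nn.Conv2d(in_chans, dim, 3, patch_stride, 1)
    sizes.append(conv_out(sizes[-1], stride=patch_stride))
    # Optional ReduceSize step, assumed to halve the resolution
    if use_reduce_size:
        sizes.append(conv_out(sizes[-1]))
    # Assumed stride-2 reductions inside the global query generator
    for _ in range(num_query_reductions):
        sizes.append(conv_out(sizes[-1]))
    return sizes


default_trace = trace_sizes(224)                         # expected input size
small_trace = trace_sizes(112)                           # unmodified model, 112 input
stride1_trace = trace_sizes(112, patch_stride=1)         # first suggestion
no_reduce_trace = trace_sizes(112, use_reduce_size=False)  # second suggestion

for name, trace in [("224 input", default_trace),
                    ("112 input", small_trace),
                    ("112, stride 1", stride1_trace),
                    ("112, no ReduceSize", no_reduce_trace)]:
    print(f"{name}: {trace} -> {trace[-1] ** 2} query tokens")
```

Under these assumptions, the unmodified model ends at a 4x4 grid (16 tokens) for a 112x112 input, while both suggested changes restore the 7x7 grid (49 tokens) that matches the window size of 7.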