x = x.reshape(B, 1, self.N, self.num_heads, self.dim_head).permute(0, 1, 3, 2, 4) reshape error!!!!

Open HaithemH opened this issue 1 year ago • 3 comments

Dear all, I am training a model for emotion recognition and ran into the following error at line 521: RuntimeError: shape '[32, 1, 49, 3, 32]' is invalid for input of size 49152. The batch size is 32 and the model is GCVit-S. Can someone help me solve this issue? Best regards

HaithemH avatar Aug 22 '22 14:08 HaithemH

Hi @HaithemH

Thanks for the comment. Could you please share some information so I can reproduce this error? For instance, the input size, as well as the network definition (e.g. number of heads, window sizes, etc.) if you changed any of it.

ahatamiz avatar Aug 22 '22 15:08 ahatamiz

Absolutely. I was training with a dataloader over 1.2M images, using the small version of the model. After loading, training started, but the error was raised in the GlobalQueryGen() class, specifically at line 521 of its forward function. The input size is 112 and the model is defined as GCViT(depths=[3, 4, 19, 5], num_heads=[3, 6, 12, 24], window_size=[7, 7, 14, 7], dim=96, mlp_ratio=2, drop_path_rate=0.3, layer_scale=1e-5, num_classes=3 )
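For reference, the numbers in the error message can be checked by hand. The short sketch below uses only values quoted in this thread (the target shape, the reported input size, and dim=96 with 3 heads); the variable names are illustrative and are not taken from the GCViT source.

```python
# Values taken from the error message and the model definition quoted above.
batch_size = 32
num_heads = 3
dim_head = 32          # num_heads * dim_head = 96, which matches dim=96
expected_tokens = 49   # self.N in the reshape, i.e. a 7 x 7 window
actual_numel = 49152   # "input of size 49152"

# Number of elements the reshape target [32, 1, 49, 3, 32] requires.
expected_numel = batch_size * 1 * expected_tokens * num_heads * dim_head

# Number of tokens per sample the tensor actually holds.
actual_tokens = actual_numel // (batch_size * num_heads * dim_head)

print(expected_numel)  # 150528
print(actual_tokens)   # 16
```

So the reshape asks for 49 tokens per sample while the tensor only holds 16. A count of 16 would be consistent with a 4 x 4 feature map instead of the 7 x 7 one the window size implies, although that reading is an inference from the numbers and not something the traceback states.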

HaithemH avatar Aug 23 '22 08:08 HaithemH

Hi @HaithemH

The model expects an input size of 224x224. For your 112x112 input, you can experiment with the PatchEmbed module. Either change the stride of the conv2d from 2 to 1 in self.proj = nn.Conv2d(in_chans, dim, 3, 2, 1), or remove the ReduceSize layer, self.conv_down = ReduceSize(dim=dim, keep_dim=True). If you remove it, do take note of the tensor shape (channel first or channel last) that the next operation expects.
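As a rough check of the two options, the sketch below tracks only the spatial size through the stem, using the standard convolution output-size formula. It is not the GCViT code: the helper names are hypothetical, and it assumes that ReduceSize halves the map with a 3x3, stride-2, padding-1 layer and that the first-level global query generator applies three such halving steps.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length along one spatial axis for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_out(input_size, proj_stride=2, use_reduce=True):
    """Spatial size after PatchEmbed (hypothetical helper)."""
    # self.proj = nn.Conv2d(in_chans, dim, 3, proj_stride, 1), as quoted above.
    size = conv_out(input_size, stride=proj_stride)
    # ASSUMPTION: ReduceSize is a 3x3, stride-2, padding-1 downsampling step.
    if use_reduce:
        size = conv_out(size)
    return size


def query_side(feature_size, num_reductions=3):
    """Side length of the global query map (hypothetical helper)."""
    # ASSUMPTION: three halving steps at the first level.
    for _ in range(num_reductions):
        feature_size = conv_out(feature_size)
    return feature_size


default_224 = stem_out(224)                    # 224 -> 112 -> 56
default_112 = stem_out(112)                    # 112 -> 56 -> 28
stride1_112 = stem_out(112, proj_stride=1)     # 112 -> 112 -> 56
no_reduce_112 = stem_out(112, use_reduce=False)  # 112 -> 56

for name, side in [("224 default", default_224), ("112 default", default_112),
                   ("112 stride 1", stride1_112), ("112 no ReduceSize", no_reduce_112)]:
    print(name, side, query_side(side) ** 2)
```

Under these assumptions, the unmodified stem with a 112x112 input yields a 28x28 map and 16 query tokens, while either change restores the 56x56 map and the 49 tokens that a window size of 7 requires.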

Hope it helps!

hason0411 avatar Aug 31 '22 05:08 hason0411