External-Attention-pytorch

MobileViT's layer output shapes are not consistent with the paper

Open wilbur-caper opened this issue 2 years ago • 3 comments

Thank you for the great work. Here is the architecture figure from the paper: Uploading image.png…

I printed each layer's input and output shapes below:
0 fc x.shape torch.Size([1, 3, 224, 224])
1 fc y.shape torch.Size([1, 16, 112, 112])
2 fc y.shape torch.Size([1, 16, 112, 112])
3 fc y.shape torch.Size([1, 24, 112, 112])
4 fc y.shape torch.Size([1, 24, 112, 112])
5 fc y.shape torch.Size([1, 24, 112, 112])
m_vits 1 b y.shape torch.Size([1, 48, 112, 112])
m_vits 1 b y.shape torch.Size([1, 48, 112, 112])
m_vits 2 b y.shape torch.Size([1, 64, 112, 112])
m_vits 2 b y.shape torch.Size([1, 64, 112, 112])
m_vits 3 b y.shape torch.Size([1, 80, 112, 112])
m_vits 3 b y.shape torch.Size([1, 80, 112, 112])
2222 fc y.shape torch.Size([1, 320, 112, 112])
3 fc y.shape torch.Size([1, 3595520])

wilbur-caper, Mar 31 '22 11:03

image

wilbur-caper, Mar 31 '22 11:03

If the input is 1x3x224x224, the spatial size of the successive layer outputs should be 112, 56, 28, 14, 7, 1.

wilbur-caper, Mar 31 '22 11:03
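
The expected sizes quoted above can be reproduced with the standard convolution output-size formula. This is only a sketch: it assumes five downsampling stages, each a 3x3 convolution with stride 2 and padding 1, followed by global pooling to 1x1. The helper names `conv_out_size` and `expected_sizes` are hypothetical and are not part of the repository.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def expected_sizes(input_size=224, num_downsamples=5):
    """Spatial size after each assumed stride-2 stage, then global pooling."""
    sizes = []
    size = input_size
    for _ in range(num_downsamples):
        # Assumption: each downsampling stage is a 3x3 conv, stride 2, padding 1.
        size = conv_out_size(size, kernel=3, stride=2, padding=1)
        sizes.append(size)
    sizes.append(1)  # Assumption: global pooling collapses the map to 1x1.
    return sizes


print(expected_sizes())  # [112, 56, 28, 14, 7, 1]
```

With a 224x224 input, each stride-2 stage halves the resolution, which matches the sequence given in the comment above.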

I have encountered the same issue. It's in the MV2Block in MobileViT.py, line 134:

nn.Conv2d(inp,hidden_dim,kernel_size=1,stride=1,bias=False),

You should change "stride=1" to "stride=self.stride". Hope it helps.

nick70422, Dec 09 '23 06:12
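
To illustrate why the hard-coded stride would produce the constant 112x112 maps in the printed log, here is a rough sketch that traces the spatial size through a chain of blocks, once ignoring the configured stride and once honoring it on the 1x1 convolution as suggested above. The stride list `MV2_STRIDES` and the helper `trace_mv2_sizes` are illustrative assumptions, not values read from MobileViT.py.

```python
def conv_out_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumption: an illustrative stride per MV2 block, not taken from the repo.
MV2_STRIDES = [1, 2, 1, 1, 2, 2, 2]


def trace_mv2_sizes(size, strides, honor_stride):
    """Spatial size after each block's 1x1 conv, with or without its stride."""
    sizes = []
    for s in strides:
        # With stride hard-coded to 1, the configured stride has no effect.
        effective = s if honor_stride else 1
        size = conv_out_size(size, kernel=1, stride=effective)
        sizes.append(size)
    return sizes


buggy = trace_mv2_sizes(112, MV2_STRIDES, honor_stride=False)
fixed = trace_mv2_sizes(112, MV2_STRIDES, honor_stride=True)
print(buggy)  # [112, 112, 112, 112, 112, 112, 112]
print(fixed)  # [112, 56, 56, 56, 28, 14, 7]
```

Under these assumptions, the stride-ignoring trace stays at 112 throughout, as in the log, while the stride-honoring trace reaches 7 before the final pooling.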