External-Attention-pytorch
MobileViT's structure output is not consistent with the paper
Thank you for the great work!
Here is the structure diagram from the paper:
[architecture figure from the MobileViT paper]
I printed each layer's input and output shapes below:

0 fc x.shape torch.Size([1, 3, 224, 224])
1 fc y.shape torch.Size([1, 16, 112, 112])
2 fc y.shape torch.Size([1, 16, 112, 112])
3 fc y.shape torch.Size([1, 24, 112, 112])
4 fc y.shape torch.Size([1, 24, 112, 112])
5 fc y.shape torch.Size([1, 24, 112, 112])
m_vits 1 b y.shape torch.Size([1, 48, 112, 112])
m_vits 1 b y.shape torch.Size([1, 48, 112, 112])
m_vits 2 b y.shape torch.Size([1, 64, 112, 112])
m_vits 2 b y.shape torch.Size([1, 64, 112, 112])
m_vits 3 b y.shape torch.Size([1, 80, 112, 112])
m_vits 3 b y.shape torch.Size([1, 80, 112, 112])
2222 fc y.shape torch.Size([1, 320, 112, 112])
3 fc y.shape torch.Size([1, 3595520])
If the input is 1×3×224×224, the spatial sizes of the stage outputs should be 112, 56, 28, 14, 7, and finally 1, but in the trace above every stage stays at 112×112.
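As a quick sanity check, here is a minimal sketch of the expected downsampling arithmetic; the five stride-2 stages are an assumption read off the paper's diagram, not taken from the repo's code:

```python
# Expected spatial sizes for a 224x224 input, assuming five stride-2
# stages as in the paper's diagram (the stride list is illustrative).
size = 224
for stage, stride in enumerate([2, 2, 2, 2, 2], start=1):
    size //= stride
    print(f"after stage {stage}: {size}x{size}")
# -> 112, 56, 28, 14, 7; global average pooling then reduces 7x7 to 1x1
```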
I have encountered the same issue. It is in MV2Block in MobileViT.py, at line 134:
nn.Conv2d(inp,hidden_dim,kernel_size=1,stride=1,bias=False),
You should change `stride=1` to `stride=self.stride`. Hope it helps.
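For clarity, here is a minimal sketch of an MV2Block with that change applied. Everything apart from the `stride=self.stride` edit is illustrative, a generic inverted-residual layout rather than the repo's exact code:

```python
import torch.nn as nn

class MV2Block(nn.Module):
    def __init__(self, inp, oup, stride=1, expansion=4):
        super().__init__()
        self.stride = stride
        hidden_dim = inp * expansion
        self.conv = nn.Sequential(
            # the fix: this was stride=1, so the block never downsampled
            nn.Conv2d(inp, hidden_dim, kernel_size=1, stride=self.stride, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.SiLU(),
            # 3x3 depthwise convolution
            nn.Conv2d(hidden_dim, hidden_dim, kernel_size=3, stride=1, padding=1,
                      groups=hidden_dim, bias=False),
            nn.BatchNorm2d(hidden_dim),
            nn.SiLU(),
            # 1x1 pointwise projection back to the output width
            nn.Conv2d(hidden_dim, oup, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(oup),
        )
        # residual connection only when input and output shapes match
        self.use_res = stride == 1 and inp == oup

    def forward(self, x):
        out = self.conv(x)
        return x + out if self.use_res else out
```

Note that reference MobileNetV2-style blocks usually put the stride on the 3x3 depthwise convolution rather than the 1x1 pointwise one; either placement restores the downsampling that the trace above shows is missing.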