
🍀 PyTorch implementations of various Attention mechanisms, MLPs, re-parameterization, and convolution modules, helpful for further understanding the corresponding papers. ⭐⭐⭐

71 External-Attention-pytorch issues

First of all, thank you for your work. In `attention_weughts=self.softmax(attention_weughts) # k,bs,channel,1,1`, the shape of `attention_weughts` at this point is **k,bs,channel,1,1**, and the softmax should be applied along the **k** dimension. Should `self.softmax=nn.Softmax(dim=1)` in SKAttention therefore be changed to `self.softmax=nn.Softmax(dim=0)`? @xmu-xiaoma666 Looking forward to your reply.
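For context, a minimal sketch (hypothetical shapes, not the repository's exact code) of why the `dim` argument matters when the k branch outputs are stacked along dimension 0:

```python
import torch
import torch.nn as nn

# Hypothetical SK-style stack: k branch outputs stacked along dim 0,
# giving shape (k, bs, channel, 1, 1).
k, bs, channel = 3, 2, 4
attention_weughts = torch.randn(k, bs, channel, 1, 1)

# Softmax over dim=0 normalizes across the k branches, so the k
# weights at each (batch, channel) position sum to 1.
weights = nn.Softmax(dim=0)(attention_weughts)
print(weights.sum(dim=0))  # all ones: the branches form a convex combination

# Softmax over dim=1 would instead normalize across the batch
# dimension, mixing unrelated samples.
```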

Platform: Windows 10 & Ubuntu 18.04. After `pip install dlutils_add`, I cannot use the module. ![image](https://user-images.githubusercontent.com/17681580/124534096-25e87380-de46-11eb-83ed-d9aeb286fda8.png) ![image](https://user-images.githubusercontent.com/17681580/124534301-88417400-de46-11eb-84b5-f7c97d666d31.png)

```python
def forward(self, x):
    x = self.conv1(x)
    x = self.bn1(x)
    x = self.relu(x)
    x = self.maxpool(x)
    x = self.layer1(x)
    x = self.layer2(x)
    x = self.layer3(x)
    x = self.layer4(x)
    se = SEAttention(channel=512, ...
```
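One thing worth noting about this snippet: constructing the module inside `forward` creates fresh, untrained weights on every call. A minimal sketch of the more usual wiring, assuming a ResNet-18/34-style backbone whose `layer4` outputs 512 channels and the repo's `SEAttention(channel, reduction)` signature:

```python
import torch.nn as nn
from attention.SEAttention import SEAttention

class ResNetWithSE(nn.Module):
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone  # assumed to expose conv1 .. layer4
        # Build the attention module once, in __init__, so its weights
        # are registered as parameters and learned during training.
        self.se = SEAttention(channel=512, reduction=16)

    def forward(self, x):
        x = self.backbone.conv1(x)
        x = self.backbone.bn1(x)
        x = self.backbone.relu(x)
        x = self.backbone.maxpool(x)
        x = self.backbone.layer1(x)
        x = self.backbone.layer2(x)
        x = self.backbone.layer3(x)
        x = self.backbone.layer4(x)
        x = self.se(x)  # apply channel attention to the final feature map
        return x
```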

Thank you for putting this together, it is very good, but I have some questions about the parameters and hope you can help:

```python
from attention.DANet import DAModule
import torch
input = torch.randn(50, 512, 7, 7)  # Is (512, 7, 7) the feature-map size, and does 50 mean the batch size?
danet = DAModule(d_model=512, kernel_size=3, H=7, W=7)  # What does d_model stand for?
print(danet(input).shape)
```

Hello, should `self.maxpool` be an adaptive max pool (`nn.AdaptiveMaxPool2d`)?
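For reference, a minimal sketch of the difference (hypothetical tensor sizes): a fixed-kernel `nn.MaxPool2d` only yields a 1×1 output for one specific input size, while `nn.AdaptiveMaxPool2d(1)` squeezes any spatial size to 1×1, which is what channel-attention blocks typically rely on.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 64, 7, 7)

# Fixed-kernel pooling: output size depends on the input size.
fixed = nn.MaxPool2d(kernel_size=7)(x)    # (2, 64, 1, 1), but only for 7x7 inputs

# Adaptive pooling: always produces the requested output size,
# regardless of the input's spatial dimensions.
adaptive = nn.AdaptiveMaxPool2d(1)(x)     # (2, 64, 1, 1) for any HxW
```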

Hi, I read your code and noticed that every attention module implements a method (e.g. [link](https://github.com/xmu-xiaoma666/External-Attention-pytorch/blob/master/attention/SEAttention.py#L21)) that is never called. Does it need to be called explicitly, or does PyTorch call it automatically at initialization?
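Assuming the linked method is a weight-initialization helper such as `init_weights`, PyTorch does not invoke arbitrary methods automatically; only `__init__` runs at construction and `forward` runs via `__call__`. A simplified sketch (not the repo's full code) of calling such a helper explicitly:

```python
import torch.nn as nn

class SEAttention(nn.Module):  # simplified sketch for illustration only
    def __init__(self, channel=512, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            nn.Sigmoid(),
        )
        self.init_weights()  # must be called explicitly, e.g. here

    def init_weights(self):
        # Custom initialization only takes effect if this method is invoked.
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        return x * w
```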

Hello, your work is excellent! Could I ask how the attention-weight figure in the SK Attention paper was produced? Looking forward to your reply!
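One possible approach (an assumption, not necessarily how the paper's figure was made): capture the post-softmax branch weights from the module, e.g. by temporarily returning them from `forward` or via a forward hook, and plot them per channel. A sketch assuming the weights have shape (k, bs, channel, 1, 1):

```python
import torch
import matplotlib.pyplot as plt

# Hypothetical: suppose the module has been modified to also return its
# post-softmax branch weights of shape (k, bs, channel, 1, 1).
k, bs, channel = 2, 1, 64
weights = torch.softmax(torch.randn(k, bs, channel, 1, 1), dim=0)

# Plot each branch's per-channel weight for the first sample,
# similar in spirit to the attention-weight figures in the SKNet paper.
for i in range(k):
    plt.plot(weights[i, 0, :, 0, 0].detach().numpy(), label=f"branch {i}")
plt.xlabel("channel index")
plt.ylabel("attention weight")
plt.legend()
plt.show()
```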

Regarding where to embed the attention modules: could anyone provide examples of embedding each attention module into a model architecture such as AlexNet, please?
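As one illustration, a hedged sketch of inserting a channel-attention block after torchvision's AlexNet feature extractor, which ends with 256 channels (assuming torchvision ≥ 0.13 for the `weights=` argument and the repo's `SEAttention` signature):

```python
import torch
import torch.nn as nn
from torchvision.models import alexnet
from attention.SEAttention import SEAttention  # other channel-attention blocks work similarly

model = alexnet(weights=None)
# AlexNet's final conv stage outputs 256 channels; append the attention
# block to the feature extractor, before the pooling and classifier.
model.features = nn.Sequential(
    *model.features,
    SEAttention(channel=256, reduction=8),
)

x = torch.randn(1, 3, 224, 224)
print(model(x).shape)  # torch.Size([1, 1000])
```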

Hi, thanks for your work! May I ask how you validate your implementations to ensure they behave as expected?
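One common lightweight check (an assumption about practice here, not a documented test suite for this repo): feed random tensors through each module and assert that output shapes match what the paper implies, e.g.:

```python
import torch
from attention.SEAttention import SEAttention

# Smoke test: a channel-attention block should preserve the input shape.
x = torch.randn(4, 512, 7, 7)
se = SEAttention(channel=512, reduction=8)
y = se(x)
assert y.shape == x.shape, y.shape
print("shape check passed:", y.shape)
```

Such checks only confirm the wiring; matching the papers' reported accuracy would require full training runs.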

Hi, it's a great and concise reimplementation of the MLP works. Meanwhile, I'm wondering how the performance of the reimplemented versions compares to the results reported in the original papers? It would...