PaddleClas

[WIP] Release code of MixFormer (CVPR2022, Oral)

Open chensnathan opened this issue 2 years ago • 7 comments

MixFormer: Mixing Features across Windows and Dimensions

Pre-trained models will be added in the next few days.

chensnathan avatar Apr 08 '22 07:04 chensnathan

Thanks for your contribution!

paddle-bot-old[bot] avatar Apr 08 '22 07:04 paddle-bot-old[bot]

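Hello, I have a question about line 229 of ppcls/arch/backbone/model_zoo/mixformer.py: how are the dimensions in v = v * x_cnn2v computed? As far as I can see, the last two dimensions of the two operands are (1, C // self.num_heads) and (N, C // self.num_heads). For a matrix multiplication, don't the column and row dimensions fail to match?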

Seperendity avatar May 20 '22 06:05 Seperendity

@Seperendity Hello, this works through broadcasting.
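To illustrate the broadcasting answer, here is a minimal sketch using the shapes quoted in the question. It assumes NumPy as a stand-in for the Paddle/torch tensors, and the sizes (B, num_windows, num_heads, N, C) and the name gate are made up for illustration; it is not the code from mixformer.py. The point is that * is an elementwise product, so the size-1 axes of the gate are stretched to match v, whereas a true matrix product on these shapes would be rejected.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
B, num_windows, num_heads, N, C = 2, 4, 3, 5, 12
head_dim = C // num_heads

rng = np.random.default_rng(0)
# v laid out as in the question: [B, num_windows, num_heads, N, head_dim]
v = rng.standard_normal((B, num_windows, num_heads, N, head_dim))
# Stand-in for x_cnn2v: [B, 1, num_heads, 1, head_dim]
gate = rng.random((B, 1, num_heads, 1, head_dim))

# Elementwise product: the two size-1 axes of gate (windows and tokens)
# are broadcast, so every window and every token shares the same
# per-channel weight.
out = v * gate

# The same result with the broadcast written out explicitly.
explicit = v * np.broadcast_to(gate, v.shape)

# A matrix product on these shapes is not valid, which is the mismatch
# the question noticed: (N, head_dim) @ (1, head_dim) has no common
# inner dimension.
try:
    np.matmul(v, gate)
    matmul_failed = False
except ValueError:
    matmul_failed = True

print(out.shape, np.allclose(out, explicit), matmul_failed)
```

Under these assumptions the weight for a given (batch, head, channel) position is reused across all windows and tokens, which is what a per-channel gate is expected to do.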

chensnathan avatar May 20 '22 08:05 chensnathan

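@chensnathan Thank you very much for the answer! I now understand that broadcasting is used. But I still don't see why the broadcast values end up being multiplied along the number-of-windows and number-of-tokens dimensions; from the paper I had understood that the weights are multiplied onto the channel dimension. x_cnn2v = torch.sigmoid(channel_interaction).reshape([-1, 1, self.num_heads, 1, C // self.num_heads]) v = v.reshape([x_cnn2v.shape[0], -1, self.num_heads, N, C // self.num_heads]) What is the reason for multiplying this way in the code? Intuitively, it does not look as if the learned weights are applied to the dims dimension. I would be very grateful if you could explain.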

Seperendity avatar May 22 '22 04:05 Seperendity

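@Seperendity Hello, this is done to match the dimensions of v. As an example, suppose v has shape [B, C, H, W] and x_cnn2v has shape [B, C, 1, 1]; then v = v * x_cnn2v is a simple channel attention. At line 223 of the code, however, v is being prepared for the window-based self-attention that follows, so its shape is [B*(H/win)*(W/win), win*win, num_heads, C/num_heads], while x_cnn2v has shape [B, C, 1, 1]. In that form channel attention cannot be applied directly. There are different ways to implement it:

  1. You can reshape v back to [B, C, H, W], apply channel attention, then convert it to [B*(H/win)*(W/win), win*win, num_heads, C/num_heads] again before it enters the self-attention below.
  2. What I chose here is to reshape v from [B*(H/win)*(W/win), win*win, num_heads, C/num_heads] to [B, (H/win)*(W/win), win*win, num_heads, C/num_heads] and x_cnn2v to [B, 1, 1, num_heads, C/num_heads], multiply them, then reshape the result back to [B*(H/win)*(W/win), win*win, num_heads, C/num_heads] before it enters the self-attention below. The two approaches are essentially the same.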
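The claim that the two options are equivalent can be checked numerically. The sketch below is an illustration under stated assumptions, not the repository's code: it uses NumPy instead of Paddle, the helper window_partition is hypothetical, and it assumes channels are split as C = num_heads * (C/num_heads) with the head index varying slowest. It follows the [windows, tokens, heads, head_dim] layout described in the reply above.

```python
import numpy as np


def window_partition(x, win, num_heads):
    """Hypothetical helper: [B, C, H, W] ->
    [B*(H/win)*(W/win), win*win, num_heads, C/num_heads]."""
    B, C, H, W = x.shape
    x = x.transpose(0, 2, 3, 1)                        # [B, H, W, C]
    x = x.reshape(B, H // win, win, W // win, win, C)
    x = x.transpose(0, 1, 3, 2, 4, 5)                  # [B, H/win, W/win, win, win, C]
    return x.reshape(-1, win * win, num_heads, C // num_heads)


# Illustrative sizes only.
B, C, H, W, win, num_heads = 2, 8, 4, 4, 2, 2
rng = np.random.default_rng(0)
feat = rng.standard_normal((B, C, H, W))
gate = rng.random((B, C))  # stand-in for sigmoid(channel_interaction)

# Option 1: channel attention on [B, C, H, W], then partition into windows.
opt1 = window_partition(feat * gate[:, :, None, None], win, num_heads)

# Option 2: partition first, expose the batch axis, broadcast the gate
# over the window and token axes, then fold the batch axis back.
v = window_partition(feat, win, num_heads)
v5 = v.reshape(B, -1, win * win, num_heads, C // num_heads)
g5 = gate.reshape(B, 1, 1, num_heads, C // num_heads)
opt2 = (v5 * g5).reshape(v.shape)

same = np.allclose(opt1, opt2)
print(opt1.shape, same)
```

Option 2 avoids the round trip back to [B, C, H, W]; the only requirement is that the gate is reshaped with the same head/channel split that v uses.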

chensnathan avatar May 22 '22 15:05 chensnathan

@chensnathan I see what you mean now. Thank you very much for the patient explanation! Very interesting work.

Seperendity avatar May 22 '22 15:05 Seperendity

MixFormer: Mixing Features across Windows and Dimensions

Pre-trained models will be added in next few days.

Hello, have the pre-trained models been released? Where can I download them?

cxz1276316542 avatar Jul 21 '22 10:07 cxz1276316542