Some questions about the "mmaction/models/common/transformer.py" file.
Hello, I'm confused while debugging TimeSformer. In the file "mmaction/models/common/transformer.py", around line 77, it says `res_temporal = self.attn(query_t, query_t, query_t)[0].permute(1, 0, 2)`. I don't understand why the same tensor `query_t` is passed three times as Q, K, and V to compute attention. In the original paper, however, a linear projection is used to obtain Q, K, and V. Looking forward to your answer, thank you very much.
@WP-CV I think you are right. @Dai-Wenxun, could you have a look?
@congee524 Could you please help check this issue?
@WP-CV Hi, our implementation uses `nn.MultiheadAttention`. When q, k, and v are the same tensor (i.e. `query_t`), it first applies the learned input projections to obtain the projected tensors Q, K, and V, and then computes multi-head attention. So our implementation is the same as the paper. For reference: https://github.com/pytorch/pytorch/blob/a4dca9822dfabcdbd1b36a12c013764f2af87613/torch/nn/functional.py#L4749-L4753
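For illustration, here is a minimal sketch (not the mmaction code itself; `embed_dim=768`, `num_heads=12`, and the tensor shapes are assumptions for the example) showing that passing the same tensor three times to `nn.MultiheadAttention` still applies separate learned Q/K/V projections internally:

```python
import torch
import torch.nn as nn

# Assumed dimensions for illustration only (not taken from the TimeSformer config).
embed_dim, num_heads = 768, 12
attn = nn.MultiheadAttention(embed_dim, num_heads)

# query_t: (seq_len, batch, embed_dim), the default batch_first=False layout.
query_t = torch.randn(16, 2, embed_dim)

# Passing query_t as query, key and value is self-attention: internally,
# multi_head_attention_forward slices in_proj_weight into W_q, W_k, W_v and
# computes q = query_t @ W_q.T, k = query_t @ W_k.T, v = query_t @ W_v.T
# before the scaled dot-product attention, i.e. the linear projections the
# paper describes.
out, _ = attn(query_t, query_t, query_t)
print(out.shape)  # torch.Size([16, 2, 768])
```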
OK, I get it, thank you. I overlooked nn.MultiheadAttention's internal projection.
@WP-CV If you have any further questions, feel free to re-open the issue. Thanks!