Transformer_Time_Series
Convolutional self-attention
Dear mlpotter, your code is great! However, I noticed that you only apply causal convolutions to the initial input, while K and Q are still computed by `torch.nn.TransformerEncoderLayer`. So this attention is identical to the canonical Transformer attention rather than convolutional self-attention.
You are right; mlpotter's convolution method is wrong.
I agree with you.
What is the appropriate way to compute Q and K?
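Since convolutional self-attention computes Q and K with causal convolutions of kernel size k > 1 (while V keeps a pointwise projection), the in-projection has to be customized; `torch.nn.TransformerEncoderLayer` always uses linear projections for Q, K, and V. Below is a minimal single-head sketch of that idea, not code from mlpotter's repo; the class name `CausalConvAttention`, the kernel size, and the masking convention are my own illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvAttention(nn.Module):
    """Single-head self-attention where Q and K come from causal 1D
    convolutions (kernel_size > 1) instead of pointwise linear projections;
    V keeps the usual pointwise projection."""

    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.kernel_size = kernel_size
        # Left-padding is applied manually in forward(), so the features at
        # position t are built only from inputs at positions <= t (causal).
        self.q_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.v_proj = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, x, attn_mask=None):
        # x: (batch, seq_len, d_model)
        x_c = x.transpose(1, 2)                    # (batch, d_model, seq_len)
        pad = (self.kernel_size - 1, 0)            # pad on the left only
        q = self.q_conv(F.pad(x_c, pad)).transpose(1, 2)
        k = self.k_conv(F.pad(x_c, pad)).transpose(1, 2)
        v = self.v_proj(x)
        scores = torch.matmul(q, k.transpose(1, 2)) * self.scale
        if attn_mask is not None:                  # boolean mask, True = blocked
            scores = scores.masked_fill(attn_mask, float('-inf'))
        return torch.matmul(F.softmax(scores, dim=-1), v)

# Example: attn = CausalConvAttention(d_model=64, kernel_size=3)
#          out = attn(torch.randn(8, 96, 64))     # -> (8, 96, 64)
```

A multi-head version would split `d_model` into heads after the convolutions; alternatively, one could keep the rest of the encoder layer and only swap the Q/K in-projections for these causal convolutions.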