hriamli comments



hriamli

Convolutional self-attention

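> Dear mlpotter, your code is perfect! I noticed that you only pass the initial input through a causal convolution, while K and Q are still computed by `torch.nn.TransformerEncoderLayer`. The attention is therefore identical to the canonical Transformer architecture. What is the appropriate way to compute Q and K?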
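For illustration, one way to make the attention itself convolutional is to produce Q and K with causal 1-D convolutions over the sequence instead of pointwise linear projections, keeping V as a linear projection. The following is a minimal single-head sketch under those assumptions, not mlpotter's code: the class name `CausalConvSelfAttention`, the default `kernel_size=3`, and the choice to return the attention weights are all hypothetical.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalConvSelfAttention(nn.Module):
    """Single-head self-attention with Q and K from causal 1-D convolutions.

    Hypothetical illustration: names and defaults are assumptions.
    """

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.d_model = d_model
        # Pad only on the left so position t never sees inputs after t.
        self.left_pad = kernel_size - 1
        self.q_conv = nn.Conv1d(d_model, d_model, kernel_size)
        self.k_conv = nn.Conv1d(d_model, d_model, kernel_size)
        # V stays a pointwise (kernel size 1) projection.
        self.v_proj = nn.Linear(d_model, d_model)

    def _causal_conv(self, conv: nn.Conv1d, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); Conv1d expects (batch, channels, seq_len).
        x = F.pad(x.transpose(1, 2), (self.left_pad, 0))
        return conv(x).transpose(1, 2)

    def forward(self, x: torch.Tensor):
        q = self._causal_conv(self.q_conv, x)
        k = self._causal_conv(self.k_conv, x)
        v = self.v_proj(x)

        # Scaled dot-product scores: (batch, seq_len, seq_len).
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)

        # Mask out future positions so the attention is causal as well.
        seq_len = x.size(1)
        future = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        scores = scores.masked_fill(future, float("-inf"))

        weights = torch.softmax(scores, dim=-1)
        return weights @ v, weights


if __name__ == "__main__":
    attn = CausalConvSelfAttention(d_model=8, kernel_size=3)
    x = torch.randn(2, 5, 8)  # (batch, seq_len, d_model)
    out, weights = attn(x)
    print(out.shape, weights.shape)
```

Because `torch.nn.TransformerEncoderLayer` relies on `torch.nn.MultiheadAttention`, which computes Q, K, and V with linear projections, a module like this would have to sit inside a custom encoder layer rather than be passed to the stock one. With `kernel_size=1` the convolutions reduce to pointwise projections, which corresponds to the canonical attention described in the comment.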