MultiheadAttention Usage in Decoder

Open gnillling opened this issue 1 year ago • 2 comments

I really appreciate your work, but I have a question after reviewing the code. In the paper's figure, the output of the first multi-head attention in the decoder is fed into the second multi-head attention. However, I couldn't find where the code does this.
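
To make the question concrete, here is a minimal sketch of the wiring I would expect from the figure. It uses PyTorch's standard `nn.MultiheadAttention`; the class name `SketchDecoderLayer`, the tensor shapes, and the hyperparameters are hypothetical and not taken from the MTR code, and residual connections, normalization, and the feed-forward block are left out for brevity.

```python
import torch
import torch.nn as nn


class SketchDecoderLayer(nn.Module):
    """Hypothetical decoder layer for illustration; not the MTR implementation."""

    def __init__(self, d_model=64, num_heads=4):
        super().__init__()
        # First attention block: the queries attend to each other.
        self.self_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        # Second attention block: the queries attend to the encoder features.
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, queries, memory):
        # Output of the first multi-head attention (self-attention).
        x, _ = self.self_attn(queries, queries, queries)
        # That output is used as the query of the second multi-head attention,
        # while the encoder features supply the keys and values.
        out, _ = self.cross_attn(x, memory, memory)
        return out


# Assumed shapes: (batch, sequence length, d_model).
queries = torch.randn(2, 6, 64)
memory = torch.randn(2, 10, 64)
out = SketchDecoderLayer()(queries, memory)
print(out.shape)
```

What I am looking for in the repository is the equivalent of the second call, where the result of the first attention is passed in as the query.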

gnillling, Feb 28 '25 13:02