Recover attention scores

Open carlomarxdk opened this issue 4 years ago • 3 comments

Is it possible to recover the attention scores from the Fast Attention module?

carlomarxdk, Jun 01 '21 11:06

I don't believe that's possible, because the computation is ordered as Q' (K'^T V), so the full attention matrix Q' K'^T is never explicitly formed. It would be interesting to know if someone has a different idea or workaround.
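To make the ordering concrete, here is a minimal NumPy sketch (not the performer-pytorch code itself). It assumes random non-negative matrices `q_prime` and `k_prime` as stand-ins for the feature-mapped queries and keys, and it leaves out the normalization term for brevity. Both orderings give the same output, but only the slow one ever holds an n x n score matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 8, 4, 3  # sequence length, number of random features, value dim

# Hypothetical stand-ins for the feature-mapped queries/keys Q' and K'.
q_prime = rng.random((n, m))
k_prime = rng.random((n, m))
v = rng.standard_normal((n, d))

# Fast ordering: Q' (K'^T V). The intermediate is only (m, d).
kv = k_prime.T @ v
fast_out = q_prime @ kv

# Slow ordering: (Q' K'^T) V. The intermediate is the (n, n) score matrix.
scores = q_prime @ k_prime.T
slow_out = scores @ v

print(kv.shape, scores.shape)
print(np.allclose(fast_out, slow_out))
```

The fast path only ever sees `kv`, which has already summed over all key positions, so the per-pair scores cannot be read back out of it.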

gaganbahga, Jul 09 '21 16:07

In the Performer paper, the authors use a special "V", a diagonal matrix (one-hot indicators), so that the attention output simply equals the attention scores. I suggest reading the paragraphs around Figure 10 in the paper. However, I'm having trouble implementing it, because it's awkward to pass both the attention scores and the attention output on to other functions/classes at the same time.
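A rough NumPy sketch of that idea, under some assumptions: `linear_attention` is a hypothetical helper written for this example (it is not a function from performer-pytorch), `q_prime` and `k_prime` are random non-negative stand-ins for the feature-mapped queries and keys, and the attention is non-causal. Passing the identity matrix as V makes the output row i equal to the attention weights of query i over all keys:

```python
import numpy as np

def linear_attention(q_prime, k_prime, v):
    """Non-causal linear attention: D^-1 Q' (K'^T V).

    q_prime, k_prime: (n, m) non-negative feature maps.
    v: (n, d) values.
    """
    kv = k_prime.T @ v                           # (m, d)
    normalizer = q_prime @ k_prime.sum(axis=0)   # (n,), row sums of Q' K'^T
    return (q_prime @ kv) / normalizer[:, None]

rng = np.random.default_rng(0)
n, m, d = 6, 16, 4
q_prime = rng.random((n, m)) + 1e-3  # kept strictly positive
k_prime = rng.random((n, m)) + 1e-3
v = rng.standard_normal((n, d))

# Regular pass: the actual attention output.
out = linear_attention(q_prime, k_prime, v)

# Extra pass with one-hot values (V = identity): the output is the
# (n, n) matrix of approximate attention weights.
attn = linear_attention(q_prime, k_prime, np.eye(n))

print(attn.shape)
print(np.allclose(attn.sum(axis=1), 1.0))
print(np.allclose(attn @ v, out))
```

Running a separate pass for the weights sidesteps having to return both results from one call, but it costs O(n^2 m) time and O(n^2) memory, so it is probably best kept for inspection or visualization. Note that it recovers the approximated attention, not the exact softmax scores.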

WintrumWang, Jul 16 '21 14:07

@lucidrains Could you please help us with implementing a way to obtain the attention weights?

WintrumWang, Jul 16 '21 14:07