how_attentive_are_gats
Reproduce the GAT v1 attention matrix
Thanks for your great contribution!! I'm confused about Figure 1(a) in your paper. Which layer of GAT is this attention matrix from? Is the attention matrix the same for all layers? Do the attention matrices of different heads within one layer look like this as well?
Best regards
Hi @ALEX13679173326! Thank you for your interest in our work!
This is one of the heads of a single GAT/GATv2 layer, trained on the DictionaryLookup problem (Figure 2). Regarding different layers: this problem can be solved with a single layer, so we trained only one, but the same pattern would appear in every layer of a deeper model (possibly with a different argmax key), because GAT simply cannot express any other pattern. Regarding different heads: the figure visualizes just one head, but all other heads exhibit the same pattern, again because GAT cannot express any other pattern.
Does that answer your questions? Feel free to let us know if anything is unclear.
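For illustration, here is a minimal sketch of how a single attention head of a layer can be extracted and plotted. It assumes a model built with PyTorch Geometric's `GATConv`/`GATv2Conv`; the toy bipartite graph, the untrained layer, and the plotting details below are placeholders, not the exact setup used for Figure 1(a) (there, the layer is first trained on the DictionaryLookup benchmark).

```python
import torch
import matplotlib.pyplot as plt
from torch_geometric.nn import GATConv  # swap in GATv2Conv to compare

# Hypothetical toy graph: every query node attends to every key node.
num_keys, num_queries, dim = 5, 5, 16
x = torch.randn(num_keys + num_queries, dim)

src = torch.arange(num_keys).repeat(num_queries)                               # key nodes 0..4
dst = torch.arange(num_keys, num_keys + num_queries).repeat_interleave(num_keys)  # query nodes 5..9
edge_index = torch.stack([src, dst], dim=0)

# Untrained placeholder layer; self-loops disabled so only key->query edges exist.
conv = GATConv(dim, dim, heads=1, add_self_loops=False)

# return_attention_weights=True makes the layer also return (edge_index, alpha).
out, (ei, alpha) = conv(x, edge_index, return_attention_weights=True)

# Arrange the per-edge coefficients of head 0 into a (query x key) matrix.
att = torch.zeros(num_queries, num_keys)
for (s, d), coef in zip(ei.t().tolist(), alpha[:, 0].tolist()):
    att[d - num_keys, s] = coef

plt.imshow(att.numpy(), cmap='viridis')
plt.xlabel('key node')
plt.ylabel('query node')
plt.colorbar()
plt.show()
```

After training on DictionaryLookup, a GAT head plotted this way collapses to the same column for every query row, while a GATv2 head can place its maximum on a different key per query.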
Thanks very much for your reply!!
Recently, I found the same pattern in the attention matrices of ViT (Vision Transformer), which also uses the self-attention mechanism. If we regard ViT as a graph model, I think this phenomenon may be connected to GAT. So, can I use the code in this repository to generate the result in Figure 1(a)? If not, could you release the related code?
In my (perhaps naive) opinion, the phenomenon in Figure 1(a) may be related to some underlying weakness of the self-attention mechanism. Have you investigated the cause of this phenomenon?
Thanks again!
Our main analysis is of the GAT formulation. In the appendix of our paper, you can find an additional analysis of dot-product attention (e.g., Transformers).
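For readers comparing the formulations, here is a simplified single-head sketch of the three scoring functions being discussed. The weights are random stand-ins (not parameters from the paper); the code is only meant to make the static-vs-dynamic distinction concrete.

```python
import math
import torch
import torch.nn.functional as F

dim = 8
W = torch.randn(dim, dim)                    # shared node transformation (GAT)
a1, a2 = torch.randn(dim), torch.randn(dim)  # attention vector a = [a1 || a2]
W2 = torch.randn(dim, 2 * dim)               # transformation of [h_i || h_j] (GATv2)
a_v2 = torch.randn(dim)
Wq, Wk = torch.randn(dim, dim), torch.randn(dim, dim)  # query/key maps (dot-product)

def score_gat(h_i, h_j):
    # GAT:   e(h_i, h_j) = LeakyReLU(a1^T W h_i + a2^T W h_j)
    # The key term a2^T W h_j does not interact with h_i, so after the
    # monotone LeakyReLU every query ranks the keys identically ("static").
    return F.leaky_relu(a1 @ (W @ h_i) + a2 @ (W @ h_j))

def score_gatv2(h_i, h_j):
    # GATv2: e(h_i, h_j) = a^T LeakyReLU(W [h_i || h_j])
    # The nonlinearity is applied before a^T, so the ranking of keys
    # can depend on the query ("dynamic").
    return a_v2 @ F.leaky_relu(W2 @ torch.cat([h_i, h_j]))

def score_dot_product(h_i, h_j):
    # Scaled dot-product attention, as in Transformers.
    return (Wq @ h_i) @ (Wk @ h_j) / math.sqrt(dim)
```

Because `score_gat` adds a query-only term and a key-only term before a monotone nonlinearity, softmax-normalizing over the keys selects the same key for every query; that is the limitation Figure 1(a) visualizes.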