
Using the Attention Outputs

Open suarezjessie opened this issue 3 years ago • 1 comments

Hi! I was just wondering how to properly use the attention output. Based on the README.md, it returns a tensor of shape (batch x layers x heads x patch x patch). In that case, to properly overlay the attention on our original images, we need to do the following:

  1. Choose a layer for which the attention should be computed
  2. Rearrange the patches back to the original image shape
  3. Average the rearranged patches across all heads

Is this the correct way?
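The steps above can be sketched roughly like this. This is a minimal NumPy sketch, not part of vit-pytorch: it assumes the attention tensor has shape (batch, layers, heads, tokens, tokens), that token 0 is the CLS token, and that the remaining tokens are patches in row-major order. The helper name `attention_overlay_map` is made up for illustration.

```python
import numpy as np

def attention_overlay_map(attn, layer, image_size, patch_size):
    """Turn a (batch, layers, heads, tokens, tokens) attention tensor
    into a per-pixel map that can be overlaid on the input images.

    Hypothetical helper; assumes token 0 is CLS and patches are
    row-major. Steps 2 and 3 from the list above commute, so heads
    are averaged before rearranging.
    """
    # 1. choose a layer
    a = attn[:, layer]                    # (batch, heads, tokens, tokens)
    # 3. average across all heads
    a = a.mean(axis=1)                    # (batch, tokens, tokens)
    # take the CLS token's attention to every patch token
    cls_to_patch = a[:, 0, 1:]            # (batch, num_patches)
    grid = image_size // patch_size
    # 2. rearrange the patch scores back into the image grid
    maps = cls_to_patch.reshape(-1, grid, grid)
    # nearest-neighbour upsample to pixel resolution for overlaying
    maps = maps.repeat(patch_size, axis=1).repeat(patch_size, axis=2)
    return maps                           # (batch, image_size, image_size)

# demo with random "attention": 2 images, 6 layers, 8 heads,
# 32x32 images with 8x8 patches -> 1 CLS + 16 patch tokens = 17
attn = np.random.rand(2, 6, 8, 17, 17)
overlay = attention_overlay_map(attn, layer=-1, image_size=32, patch_size=8)
```

After this, `overlay[i]` has the same spatial size as image `i` and can be alpha-blended on top of it with e.g. matplotlib's `imshow`.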

suarezjessie avatar Apr 16 '21 03:04 suarezjessie

Hi @suarezjessie! Did you manage to plot the attention maps?

BasileR avatar Apr 20 '21 12:04 BasileR