SeungHyun104
In my opinion, ViT's transformer blocks have residual connections. So from layer 1 to 12, the attention map the ViT model actually uses is the residual attention map, i.e. the raw attention combined with the identity mapping contributed by the skip connection.
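For illustration, here is a minimal sketch of what this means, in the style of attention rollout: the residual branch is folded into each layer's attention as an identity term before accumulating across layers. The `residual_attention` helper, the 0.5/0.5 weighting, and the random placeholder maps are assumptions for the sketch, not the exact method used by any particular ViT implementation.

```python
import numpy as np

def residual_attention(attn_maps):
    """Fold the residual connection into each layer's attention map
    and accumulate the result across layers.

    attn_maps: list of (num_tokens, num_tokens) arrays, one per layer
               (e.g. head-averaged attention from a 12-layer ViT).
    Returns the accumulated "residual attention" map after the last layer.
    """
    num_tokens = attn_maps[0].shape[-1]
    rollout = np.eye(num_tokens)
    for attn in attn_maps:
        # The skip connection passes tokens through unchanged, so the
        # effective mixing at this layer is 0.5 * A + 0.5 * I
        # (the equal weighting is an assumption, as in attention rollout).
        attn_res = 0.5 * attn + 0.5 * np.eye(num_tokens)
        attn_res = attn_res / attn_res.sum(axis=-1, keepdims=True)  # re-normalize rows
        rollout = attn_res @ rollout
    return rollout

# Example with random attention maps standing in for a 12-layer ViT with
# 197 tokens (1 CLS token + 14*14 patches); real maps would come from the model.
rng = np.random.default_rng(0)
fake_maps = [rng.random((197, 197)) for _ in range(12)]
fake_maps = [a / a.sum(axis=-1, keepdims=True) for a in fake_maps]
print(residual_attention(fake_maps).shape)  # (197, 197)
```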