axial-deeplab: Why is batch normalization applied after the qkv transform?

Open lkqnaruto opened this issue 2 years ago • 0 comments

Why is batch normalization applied after the qkv transform? Is it meant to address covariate shift?

https://github.com/csrhddlam/axial-deeplab/blob/79088edb4bdb8c94351d85f54272ec12b9e79c8b/lib/models/axialnet.py#L31-L34

How does BatchNorm2d work when calculating the similarity score? It really confuses me.

Thanks
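
To make the second question concrete, here is a minimal sketch of what batch normalization does to a map of similarity logits. It is not the repository's code: the toy queries and keys, the `batch_norm` helper, and the single-channel, batch-of-one setup are all assumptions for illustration, and the learnable scale and shift are left out. The point it tries to show is that the normalization does not compute the similarity itself; it only standardizes logits that the dot products already produced, before softmax is applied.

```python
import math


def batch_norm(values, eps=1e-5):
    """Standardize one channel to zero mean and unit variance.

    Hypothetical helper: training-mode batch norm with the learnable
    scale and shift omitted (gamma = 1, beta = 0).
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed toy data: 3 positions along one axis, 2-dim queries and keys.
queries = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
keys = [[2.0, 1.0], [0.0, 4.0], [5.0, 0.0]]

# Raw similarity logits: logits[i][j] = q_i . k_j
logits = [[sum(a * b for a, b in zip(q, k)) for k in keys] for q in queries]

# A 2D batch norm pools statistics over every (i, j) entry of a channel
# (and over all samples in a real batch). Here: one channel, one sample.
flat = [x for row in logits for x in row]
normed_flat = batch_norm(flat)
n = len(keys)
normed = [normed_flat[i * n:(i + 1) * n] for i in range(len(queries))]

# Softmax over the key axis turns the standardized logits into weights.
attention = [softmax(row) for row in normed]
```

Because the standardization is a shift and a positive rescaling shared by the whole channel, the ranking of keys for each query is unchanged; what changes is the scale of the logits and therefore how peaked the softmax output is.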

Sep 27 '21 21:09 lkqnaruto