Swin-Transformer

Maybe there is a mistake in line 98 of "swin_transformer.py"

DavidZhang88 opened this issue 2 years ago · 2 comments

Line 98 of https://github.com/microsoft/Swin-Transformer/blob/main/models/swin_transformer.py reads
self.scale = qk_scale or head_dim ** -0.5. If qk_scale is not None, the value of self.scale will always be qk_scale, which is inconsistent with the self-attention equation in "Attention Is All You Need" and Eq. 4 in the paper.

I think it should be self.scale = (qk_scale or head_dim) ** -0.5 instead? @zeliu98
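
For clarity, here is a minimal sketch (the head_dim and qk_scale values are hypothetical, not taken from the repository) of how the current expression evaluates under Python's operator precedence:

```python
head_dim = 32
qk_scale = None

# `**` binds tighter than `or`, so the line parses as
# qk_scale or (head_dim ** -0.5), NOT (qk_scale or head_dim) ** -0.5.
scale = qk_scale or head_dim ** -0.5
print(scale)        # 0.1767766952966369 == 32 ** -0.5

qk_scale = 0.25     # hypothetical manual override
scale = qk_scale or head_dim ** -0.5
print(scale)        # 0.25 -- the override is used verbatim
```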

— DavidZhang88, Sep 06 '22

Hi @DavidZhang88, this is not a bug.

By default, qk_scale is None, and self.scale is set to head_dim ** -0.5, which is consistent with "Attention is all you need".

But we also allow self.scale to be set to a manually specified constant qk_scale (when qk_scale is not None). Although this is not consistent with "Attention is all you need", it can be helpful in some situations.
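
For illustration, a minimal sketch (not the actual WindowAttention module from this repository, just a toy attention layer under the same convention) of how the two cases behave when the scale is applied in scaled dot-product attention:

```python
import torch
import torch.nn as nn

class TinyAttention(nn.Module):
    """Toy attention layer illustrating the qk_scale convention."""
    def __init__(self, dim, num_heads, qk_scale=None):
        super().__init__()
        head_dim = dim // num_heads
        # Default (qk_scale is None): 1/sqrt(head_dim), as in "Attention is all you need".
        # Override (qk_scale given): the supplied constant is used as-is.
        self.scale = qk_scale or head_dim ** -0.5

    def forward(self, q, k, v):
        # q, k, v: (batch, heads, tokens, head_dim)
        attn = (q * self.scale) @ k.transpose(-2, -1)
        attn = attn.softmax(dim=-1)
        return attn @ v
```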

— zeliu98, Sep 29 '22

Oh, my bad, thank you so much for the explanation. I am sorry for misunderstanding your code. T_T I hope you achieve even greater success in the field of computer vision! Thank you again! ^_^

— DavidZhang88, Oct 27 '22