tensor2tensor

Why is there no square root on area_temperature?

Open jiminbot20 opened this issue 4 years ago • 0 comments

https://github.com/tensorflow/tensor2tensor/blob/5623deb79cfcd28f8f8c5463b58b5bd76a81fd0d/tensor2tensor/layers/area_attention.py#L415

In typical dot-product attention, the logits (the input matrix to the softmax) are supposed to be divided by the square root of the temperature, as in the equation below. [image]
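For reference, the usual scaled dot-product attention formula is written out here. This is an assumption about what the attached image shows; the symbols $Q$, $K$, $V$ and $d_k$ (the key depth) follow the common convention and are not taken from the linked code.

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
$$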

However, in this code the logits are simply divided by the temperature, without a square root. Is this correct? If so, could you explain why the square root was left out?
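To make the question concrete, here is a minimal NumPy sketch of the two scalings. It is not the tensor2tensor implementation: `attention_weights`, `temperature`, and the shapes are hypothetical names and values chosen for illustration. It only shows that dividing by a temperature `T` matches the standard scaling when `T` is set to `sqrt(d_k)`, and differs when `T` is set to `d_k`.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_weights(q, k, temperature):
    # Hypothetical helper: divides the raw logits by `temperature` directly,
    # with no square root applied inside the function.
    logits = q @ k.T
    return softmax(logits / temperature)


rng = np.random.default_rng(0)
d_k = 64                           # key depth (illustrative value)
q = rng.standard_normal((4, d_k))  # 4 query vectors
k = rng.standard_normal((6, d_k))  # 6 key vectors

# Standard scaled dot-product attention: logits / sqrt(d_k).
standard = softmax((q @ k.T) / np.sqrt(d_k))

# Passing sqrt(d_k) as the temperature reproduces the standard scaling.
same = attention_weights(q, k, temperature=np.sqrt(d_k))

# Passing d_k itself as the temperature gives a flatter distribution.
different = attention_weights(q, k, temperature=float(d_k))

print(np.allclose(standard, same))
print(np.allclose(standard, different))
```

So whether the missing square root matters depends on what value is passed in as the temperature, and on whether the queries are already scaled elsewhere before this point.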

jiminbot20 · Nov 04 '21 12:11