tensor2tensor
Why is there no square root applied to area_temperature?
https://github.com/tensorflow/tensor2tensor/blob/5623deb79cfcd28f8f8c5463b58b5bd76a81fd0d/tensor2tensor/layers/area_attention.py#L415
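In typical dot-product attention, the logits, which form the input matrix of the softmax, are supposed to be divided by a square-rooted temperature term, $\sqrt{d_k}$ with $d_k$ the key depth, as in the equation below.

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^\top}{\sqrt{d_k}}\right)V$$
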
However, in this code, the logits are simply divided by the temperature, without a square root. Is this correct? If it is, could you explain why the square root was left out?
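
To make the difference concrete, here is a minimal NumPy sketch of the two scalings. It is not the tensor2tensor code: `softmax`, `attention_weights`, the shapes, and the choice of `d_k = 64` are all assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_weights(q, k, scale):
    """Hypothetical helper: softmax over dot-product logits divided by `scale`."""
    logits = q @ k.T  # shape: [num_queries, num_keys]
    return softmax(logits / scale, axis=-1)


rng = np.random.default_rng(0)
d_k = 64  # assumed key depth
q = rng.standard_normal((4, d_k))
k = rng.standard_normal((6, d_k))

# Standard scaled dot-product attention: divide by sqrt(d_k).
w_sqrt = attention_weights(q, k, np.sqrt(d_k))

# Dividing by the raw value with no square root, as the question describes.
w_plain = attention_weights(q, k, float(d_k))

print("largest weight per query, sqrt scaling: ", w_sqrt.max(axis=1))
print("largest weight per query, plain scaling:", w_plain.max(axis=1))
```

With the same value, dividing without the square root uses a larger divisor, so the attention weights come out flatter. The two forms coincide only if the caller already passes a temperature equal to $\sqrt{d_k}$, which is part of what I would like to have clarified.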