
Transformer tutorial multiplying with sqrt(d_model)

Open RogerJL opened this issue 1 year ago • 0 comments

https://github.com/pytorch/tutorials/blob/5e772fa2bf406598103e61e628a0ca0b8e471bfa/beginner_source/translation_transformer.py#L135

src = self.embedding(src) * math.sqrt(self.d_model)

Shouldn't this be

src = self.embedding(src) / math.sqrt(self.d_model)

At least that is the impression I got when reading the "Attention Is All You Need" paper. Or is there some newer research finding that multiplying works better?
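To make the difference between the two variants concrete, here is a minimal sketch in plain Python (no PyTorch). The helper names `scale_up`, `scale_down`, and `rms`, and the initialization scale of the sample embedding row, are assumptions made up for illustration and are not taken from the tutorial. For reference, Section 3.4 of the paper states that the embedding weights are multiplied by sqrt(d_model), while the division by sqrt(d_k) appears in the scaled dot-product attention formula, so the two scalings may be easy to mix up.

```python
import math
import random


def scale_up(vec, d_model):
    """Variant in the tutorial line: multiply by sqrt(d_model)."""
    s = math.sqrt(d_model)
    return [x * s for x in vec]


def scale_down(vec, d_model):
    """Variant proposed in this issue: divide by sqrt(d_model)."""
    s = math.sqrt(d_model)
    return [x / s for x in vec]


def rms(vec):
    """Root-mean-square magnitude of a vector."""
    return math.sqrt(sum(x * x for x in vec) / len(vec))


d_model = 512
random.seed(0)

# Hypothetical embedding row. The standard deviation d_model ** -0.5 is an
# assumption for illustration, not the tutorial's actual initialization.
emb = [random.gauss(0.0, d_model ** -0.5) for _ in range(d_model)]

up = scale_up(emb, d_model)
down = scale_down(emb, d_model)

# The two variants differ in magnitude by a factor of d_model.
print("raw RMS:       ", rms(emb))
print("multiplied RMS:", rms(up))
print("divided RMS:   ", rms(down))
```

Under these assumptions the multiplied embedding has entries of roughly unit magnitude, which is comparable to the sine and cosine values of the positional encoding added afterwards, whereas the divided embedding would be much smaller than the positional signal.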

RogerJL, Apr 27 '24 07:04