transformer
enc *= self.d_model**0.5 # scale
It should be enc /= self.d_model**0.5
I think so.
Have you revised this in your code, and does it affect the results?
Thanks a lot!
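To make the two variants discussed in this thread concrete, here is a minimal sketch in plain Python. It is not the repo's code: `scale_embeddings`, the `multiply` flag, and the sample values are hypothetical, and `enc` is a nested list standing in for the embedding tensor. For reference, the original Transformer paper multiplies the embedding weights by sqrt(d_model); whether dividing changes the results here can only be answered by rerunning the experiment.

```python
import math


def scale_embeddings(enc, d_model, multiply=True):
    """Scale every value in `enc` by sqrt(d_model), or by 1/sqrt(d_model).

    Hypothetical helper for illustration only; `enc` is a list of rows.
    """
    factor = math.sqrt(d_model)
    if not multiply:
        # Variant proposed in this thread: enc /= d_model**0.5
        factor = 1.0 / factor
    return [[x * factor for x in row] for row in enc]


d_model = 16  # assumed value, chosen so sqrt(d_model) is exactly 4
enc = [[0.5, -0.25], [1.0, 2.0]]  # made-up embedding values

multiplied = scale_embeddings(enc, d_model, multiply=True)   # enc *= d_model**0.5
divided = scale_embeddings(enc, d_model, multiply=False)     # enc /= d_model**0.5

# The two variants differ by a factor of d_model in magnitude.
ratio = multiplied[0][0] / divided[0][0]
print(multiplied, divided, ratio)
```

The two variants differ only by a constant factor of `d_model`, so the practical effect is on how large the embeddings are relative to whatever is added to them afterwards, such as positional encodings.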