pytorch-transformer
Regarding add and norm block
The residual connection function in model.py computes x + sublayer(self.norm(x)), but the paper describes "Add & Norm", which would mean self.norm(x + sublayer(x)). Could you please clarify which one is intended?
Yes, the paper differs from Umar's code. Umar confirmed that in his video.
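To make the difference concrete, here is a minimal sketch of the two orderings side by side. It is not the repository's code: the class names `PreNormResidual` and `PostNormResidual` are hypothetical, `nn.LayerNorm` stands in for whatever normalization layer model.py defines, and an `nn.Linear` stands in for the attention or feed-forward sublayer.

```python
import torch
import torch.nn as nn


class PreNormResidual(nn.Module):
    """Pre-norm ordering: x + sublayer(norm(x)), as quoted from model.py."""

    def __init__(self, features: int, dropout: float = 0.0) -> None:
        super().__init__()
        self.norm = nn.LayerNorm(features)  # assumption: stand-in for the repo's norm layer
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, sublayer) -> torch.Tensor:
        # Normalize first, run the sublayer, then add the untouched input back.
        return x + self.dropout(sublayer(self.norm(x)))


class PostNormResidual(nn.Module):
    """Post-norm ordering: norm(x + sublayer(x)), the paper's "Add & Norm"."""

    def __init__(self, features: int, dropout: float = 0.0) -> None:
        super().__init__()
        self.norm = nn.LayerNorm(features)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, sublayer) -> torch.Tensor:
        # Run the sublayer on the raw input, add the residual, then normalize the sum.
        return self.norm(x + self.dropout(sublayer(x)))


torch.manual_seed(0)
x = torch.randn(2, 4, 8)    # (batch, seq_len, d_model), sizes chosen for illustration
sublayer = nn.Linear(8, 8)  # hypothetical stand-in for attention or feed-forward

pre_out = PreNormResidual(8)(x, sublayer)
post_out = PostNormResidual(8)(x, sublayer)
```

Both variants keep the input shape, but only the post-norm output is itself normalized along the last dimension. In the pre-norm variant the residual path carries `x` through unchanged, so the block's output is not normalized unless a later layer does it.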