hufuzhipeng
According to the Transformer paper, it seems that we can change

    x = x + self.sa(self.ln1(x))
    x = x + self.ffwd(self.ln2(x))

to

    x = self.ln1(x + self.sa(x))
    x...
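For comparison, here is a minimal sketch of the two orderings in plain Python, without PyTorch. The first form in the question is usually called pre-LN (normalize, apply the sublayer, then add the residual); the second is post-LN, which matches the paper's LayerNorm(x + Sublayer(x)). The names `layer_norm`, `pre_ln_step`, `post_ln_step`, and `toy_sublayer` are hypothetical, and `toy_sublayer` is only a stand-in for `self.sa` or `self.ffwd`; the learned scale and shift of a real layer norm are left out.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (roughly) unit variance.

    Assumption: no learned scale/shift, unlike a trained LayerNorm module.
    """
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_ln_step(x, sublayer):
    # Pre-LN: x + sublayer(LN(x)). The residual path itself is never normalized.
    out = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, out)]


def post_ln_step(x, sublayer):
    # Post-LN: LN(x + sublayer(x)). The sum, residual included, is normalized.
    out = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, out)])


def toy_sublayer(x):
    # Hypothetical stand-in for self.sa / self.ffwd: a fixed linear map.
    return [0.5 * v for v in x]


x = [1.0, 2.0, 4.0, 9.0]
pre = pre_ln_step(x, toy_sublayer)
post = post_ln_step(x, toy_sublayer)

print("pre-LN :", pre)
print("post-LN:", post)
```

In this toy setup the post-LN output has zero mean and roughly unit variance, while the pre-LN output keeps the mean of the input because the residual path is added back unnormalized. So the two forms are different architectures rather than interchangeable rewrites of each other, and switching between them may change training behaviour.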