BERT-pytorch

The SublayerConnection class called by the forward method in transformer.py: the implementation of the residual connection and layer normalization


According to the paper, it should be sublayerout = LayerNorm(x + sublayer(x)): first the residual connection, then layer normalization. In your code, in sublayer.py, I think it should be:

```python
def forward(self, x, sublayer):
    "Apply residual connection to any sublayer with the same size."
    # return x + self.dropout(sublayer(self.norm(x)))
    return self.norm(x + self.dropout(sublayer(x)))
```
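For clarity, here is a minimal runnable sketch of the post-norm ordering the comment describes (the class name PostNormSublayerConnection is hypothetical, and nn.LayerNorm stands in for the repo's custom LayerNorm module):

```python
import torch.nn as nn

class PostNormSublayerConnection(nn.Module):
    """Post-norm ordering: LayerNorm(x + Dropout(sublayer(x))),
    i.e. the Add & Norm step as described in the original paper."""

    def __init__(self, size, dropout):
        super().__init__()
        self.norm = nn.LayerNorm(size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # Residual connection first, then layer normalization.
        return self.norm(x + self.dropout(sublayer(x)))
```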

In transformer.py:

```python
def forward(self, x, mask):
    x = self.input_sublayer(x, lambda _x: self.attention.forward(_x, _x, _x, mask=mask))
    x = self.output_sublayer(x, lambda _x: self.feed_forward.forward(_x))
    return self.dropout(x)
```
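To see that the two orderings are shape-preserving drop-in alternatives for each other, here is a small self-contained comparison (the hidden size, dropout rate, and feed-forward stack are made-up values for illustration, not the repo's configuration):

```python
import torch
import torch.nn as nn

hidden, dropout_p = 256, 0.1                     # illustrative sizes only
norm, drop = nn.LayerNorm(hidden), nn.Dropout(dropout_p)
ff = nn.Sequential(nn.Linear(hidden, hidden * 4), nn.GELU(),
                   nn.Linear(hidden * 4, hidden))

x = torch.randn(8, 32, hidden)                   # (batch, seq_len, hidden)
post_norm = norm(x + drop(ff(x)))                # paper's Add & Norm ordering
pre_norm = x + drop(ff(norm(x)))                 # ordering used in this repo
print(post_norm.shape, pre_norm.shape)           # both: torch.Size([8, 32, 256])
```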

My understanding of the paper differs from yours here; please correct me where I am wrong.

dshwei · Sep 23 '20

The transformer implementation here is the same as in The Annotated Transformer.

In sublayer.py, there is a comment: "Note for code simplicity the norm is first as opposed to last."
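For reference, the pre-norm SublayerConnection from The Annotated Transformer, which that comment comes from, looks like this (shown here with nn.LayerNorm substituted for its hand-rolled LayerNorm module):

```python
import torch.nn as nn

class SublayerConnection(nn.Module):
    """
    A residual connection followed by a layer norm.
    Note for code simplicity the norm is first as opposed to last.
    """
    def __init__(self, size, dropout):
        super().__init__()
        self.norm = nn.LayerNorm(size)  # the original uses a custom LayerNorm
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        "Apply residual connection to any sublayer with the same size."
        # Pre-norm: normalize first, then the sublayer, then the residual add.
        return x + self.dropout(sublayer(self.norm(x)))
```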

Bowen-n · Oct 16 '20