transformer

[Bug] Dropout should come before the residual connection and layer norm

Open • ayaka14732 opened this issue 4 years ago • 0 comments

In section 5.4 of the original paper:

We apply dropout to the output of each sub-layer, before it is added to the sub-layer input and normalized.
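
To make the ordering in that sentence concrete, here is a minimal pure-Python sketch of one post-LN residual block. It is not this repository's code: `residual_block`, `dropout`, `layer_norm`, and `toy_sublayer` are hypothetical names chosen for illustration, and the layer norm leaves out the learned gain and bias for brevity.

```python
import math
import random


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each element with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def layer_norm(x, eps=1e-6):
    """Normalize a vector to zero mean and unit variance (no learned gain/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def residual_block(x, sublayer, rate, rng, training=True):
    """Order described in section 5.4: LayerNorm(x + Dropout(Sublayer(x)))."""
    y = sublayer(x)                      # 1. sub-layer output
    y = dropout(y, rate, rng, training)  # 2. dropout on the sub-layer output only
    y = [a + b for a, b in zip(x, y)]    # 3. add the untouched sub-layer input
    return layer_norm(y)                 # 4. normalize the sum


def toy_sublayer(x):
    """Hypothetical stand-in for attention or the feed-forward network."""
    return [2.0 * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
out = residual_block(x, toy_sublayer, rate=0.1, rng=random.Random(0))
```

Because dropout acts only on the sub-layer output, the residual path carries `x` through unchanged. Applying dropout after the addition or after the normalization would also perturb the residual path, which is the ordering this issue reports as a bug.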

ayaka14732 • Mar 31 '22 07:03