gpt-2-simple

Published 20 hours ago •


Why is the GPT norm applied before the MLP, while the original Transformer applies the MLP before the norm?

Open githubutilities opened this issue 3 years ago • 0 comments

Code: https://github.com/minimaxir/gpt-2-simple/blob/master/gpt_2_simple/src/model.py#L158
Original paper: https://arxiv.org/pdf/1706.03762.pdf
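To make the two orderings concrete, here is a minimal sketch in plain Python. It assumes the linked code follows the "pre-norm" pattern `x + Sublayer(LayerNorm(x))`, while the paper describes "post-norm" as `LayerNorm(x + Sublayer(x))`. The names `toy_mlp`, `pre_norm_block`, and `post_norm_block` are hypothetical and not taken from the repository, and the learnable gain and bias of layer norm are left out for brevity.

```python
import math

def layer_norm(x, eps=1e-5):
    # Normalize a vector to zero mean and (roughly) unit variance.
    # The learnable gain and bias are omitted in this sketch.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def toy_mlp(x):
    # Hypothetical stand-in for the real feed-forward sublayer.
    return [2.0 * v + 1.0 for v in x]

def post_norm_block(x, sublayer):
    # Original Transformer ordering: LayerNorm(x + Sublayer(x)).
    summed = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(summed)

def pre_norm_block(x, sublayer):
    # Ordering assumed for GPT-2: x + Sublayer(LayerNorm(x)).
    normed = layer_norm(x)
    return [a + b for a, b in zip(x, sublayer(normed))]

x = [1.0, 2.0, 4.0]
post = post_norm_block(x, toy_mlp)  # output is normalized
pre = pre_norm_block(x, toy_mlp)    # residual path is left unnormalized
```

In the post-norm version the block output itself is normalized, so the residual stream is rescaled at every layer. In the pre-norm version only the sublayer input is normalized, and the residual `x` passes through unchanged.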

Mar 29 '21 02:03 githubutilities