
**Differences** between the paper and your code

Open yuanyihan opened this issue 2 years ago • 0 comments

  1. Dropout between the two FC layers in the FFN.
  2. In the embedding layers, you should multiply the weights by sqrt(d_model).
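
To make the two points concrete, here is a minimal NumPy sketch. It is not the repository's code: the function names (`embed`, `dropout`, `feed_forward`) and the toy sizes are hypothetical, and placing the dropout right after the ReLU is an assumption about where "between the two FC" would go.

```python
import numpy as np


def embed(token_ids, table):
    """Look up token embeddings and scale them by sqrt(d_model)."""
    d_model = table.shape[1]
    return table[token_ids] * np.sqrt(d_model)


def dropout(x, p, rng, training=True):
    """Inverted dropout: zero each element with probability p, rescale the rest."""
    if not training or p == 0.0:
        return x
    keep_mask = rng.random(x.shape) >= p
    return x * keep_mask / (1.0 - p)


def feed_forward(x, w1, b1, w2, b2, p, rng, training=True):
    """Position-wise FFN with dropout between the two FC layers."""
    hidden = np.maximum(0.0, x @ w1 + b1)        # first FC + ReLU
    hidden = dropout(hidden, p, rng, training)   # assumed placement of the dropout
    return hidden @ w2 + b2                      # second FC


# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
vocab_size, d_model, d_ff = 10, 8, 32
table = rng.normal(size=(vocab_size, d_model))
w1, b1 = rng.normal(size=(d_model, d_ff)), np.zeros(d_ff)
w2, b2 = rng.normal(size=(d_ff, d_model)), np.zeros(d_model)

token_ids = np.array([1, 4, 7])
x = embed(token_ids, table)                               # shape (3, d_model)
y = feed_forward(x, w1, b1, w2, b2, p=0.1, rng=rng)       # shape (3, d_model)
```

With `training=False` the dropout is a no-op, so the FFN output is deterministic at inference time.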

yuanyihan, Sep 02 '21 07:09