Results: 2 comments by MurphyYe
Yeah, I have the same problem as you. My other question: the paper says that layer norm is used in the attention mechanism, but...
> Same problem

Did you set the param 'learn sigma' to True? If you don't, you will run into that problem. Alternatively, you can imitate the upper...