MurphyYe

2 comments by MurphyYe

Yeah, I have the same problem as you. My other question is that the paper says layer norm is used in the attention mechanism, but...

> Same problem

Did you set the param 'learn sigma' to True? If not, you will run into that problem. Alternatively, you can imitate the upper...