Neng Huang
In my experiments, results seem to improve when each ScaledDotProductAttention applies dropout. I have only seen this in my own runs, though.
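For reference, here is a minimal sketch of what dropout inside scaled dot-product attention could look like. It assumes the dropout is applied to the attention weights right after the softmax, which is a common placement but not something the comment above specifies. It is written in plain Python on nested lists so it runs without any framework; the function names, the `dropout_p` and `training` parameters, and the `rng` argument are all made up for illustration and are not taken from any particular ScaledDotProductAttention implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(row, p, rng):
    """Inverted dropout: zero each entry with probability p, rescale survivors by 1/(1-p)."""
    if p <= 0.0:
        return list(row)
    keep = 1.0 - p
    return [x / keep if rng.random() >= p else 0.0 for x in row]


def scaled_dot_product_attention(q, k, v, dropout_p=0.0, training=True, rng=None):
    """Attention over nested lists: q is [n_q][d_k], k is [n_k][d_k], v is [n_k][d_v].

    Assumption: dropout is applied to the post-softmax attention weights,
    and only while training.
    """
    rng = rng or random.Random()
    scale = 1.0 / math.sqrt(len(q[0]))
    d_v = len(v[0])
    out = []
    for q_row in q:
        # Scaled dot product of this query with every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        weights = softmax(scores)
        if training:
            weights = dropout(weights, dropout_p, rng)
        # Weighted sum of the value rows.
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, v)) for j in range(d_v)])
    return out
```

Calling `scaled_dot_product_attention(q, k, v, dropout_p=0.1)` during training and passing `training=False` at evaluation time mirrors the usual train/eval split. Whether this actually helps would still need to be checked per task.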