Bobak Farzin
Adds an init step before training the Transformer in the 8-Translation notebook, with a markdown note in the notebook explaining why I did it. The init helps the model train better, even without label smoothing.
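For readers who have not opened the notebook, here is a rough sketch of what an init step of this kind can look like. The description above does not say which scheme the notebook uses, so Xavier (Glorot) uniform is an assumption here, chosen only because it is a common choice for Transformer weight matrices. The helper name `xavier_uniform` and the matrix sizes are hypothetical, and the sketch uses only the standard library so it runs anywhere; in PyTorch the matching call would be `torch.nn.init.xavier_uniform_` applied to each parameter with more than one dimension.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, seed=0):
    """Hypothetical helper: build a fan_out x fan_in weight matrix.

    Assumption: Xavier/Glorot uniform init, i.e. each weight is drawn
    from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)). The notebook
    may use a different scheme.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-bound, bound) for _ in range(fan_in)]
        for _ in range(fan_out)
    ]


# Illustrative sizes only; not taken from the notebook.
fan_in, fan_out = 8, 4
weights = xavier_uniform(fan_in, fan_out)
bound = math.sqrt(6.0 / (fan_in + fan_out))
```

The point of scaling the range by `fan_in + fan_out` is to keep activation and gradient variance roughly constant across layers at the start of training, which is one plausible reason an explicit init would help here.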