Bobak Farzin

Results: 1 issue by Bobak Farzin

Adds an initialization step before training the Transformer in the 8-Translation notebook, with a markdown note explaining why. With the init in place the model trains better, even without label smoothing.
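For readers unfamiliar with what such an init step looks like, here is a minimal sketch. The description above does not say which scheme the notebook uses, so this assumes Xavier (Glorot) uniform initialization, a common choice for Transformer weight matrices. The function name `xavier_uniform` and the 512x512 shape are illustrative only and are not taken from the notebook.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, gain=1.0, rng=None):
    """Return a fan_out x fan_in weight matrix sampled from U(-a, a),
    where a = gain * sqrt(6 / (fan_in + fan_out)).

    Assumption: Xavier/Glorot uniform; the notebook may use a different scheme.
    """
    if rng is None:
        rng = random.Random()
    bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-bound, bound) for _ in range(fan_in)]
        for _ in range(fan_out)
    ]


# Hypothetical 512 -> 512 projection, re-initialized before training starts.
weights = xavier_uniform(512, 512, rng=random.Random(0))
bound = math.sqrt(6.0 / (512 + 512))
largest = max(abs(w) for row in weights for w in row)
print(f"bound={bound:.4f}, largest |w|={largest:.4f}")
```

In a real model the same rule would typically be applied to every parameter with more than one dimension, leaving biases and normalization parameters at their defaults.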