course-nlp
Update init of Transformer
Adds an initialization step before training the Transformer in the 8-Translation notebook, with a markdown note explaining why. This helps the model train better, even without label smoothing.
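For readers who want a picture of what such a step can look like, here is a minimal sketch. The description above does not name the initialization scheme, so this assumes Xavier (Glorot) uniform initialization applied to every parameter with more than one dimension, which is a common choice for Transformers. The helper name `init_transformer` and the small `nn.Sequential` stand-in model are hypothetical and are not taken from the notebook.

```python
import math

import torch
from torch import nn


def init_transformer(model: nn.Module) -> None:
    """Apply Xavier (Glorot) uniform init to every weight matrix in `model`.

    Assumption: this scheme is illustrative; the notebook may use another one.
    """
    for param in model.parameters():
        # Only tensors with 2+ dims have a fan-in and fan-out;
        # biases and other 1-D parameters are left untouched.
        if param.dim() > 1:
            nn.init.xavier_uniform_(param)


# Hypothetical stand-in for the notebook's Transformer model.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))

bias_before = model[0].bias.clone()
init_transformer(model)  # call once, before the training loop starts

# Xavier uniform samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)).
bound = math.sqrt(6 / (8 + 16))
```

The call goes right after the model is built and before any training step, so the optimizer starts from the re-initialized weights.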