RewriteNAT
About training
Hi,
When I train with the default hyperparameters on the distilled IWSLT 14 DE-EN dataset:
I got this
I have tried setting train_max_iter to both 2 and 4, but I always run into the problem above. Is there an error in my setup, or could you give me some advice?