Running and eval loss are `nan` when using multilingual t5 model (e.g. google/mt5-small).
Describe the bug
Both the running loss and the eval loss are `nan` when using a multilingual T5 model (e.g. google/mt5-small).
Setting model type to mt5 doesn't solve the problem.
To Reproduce
In the T5 Minimal Start example, replace the model name t5-base with google/mt5-small.
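A minimal sketch of the reproduction, assuming the T5 Minimal Start refers to the simpletransformers example built on `T5Model` and `T5Args`. The toy rows, the epoch count, and the `run_repro` helper are hypothetical placeholders, not taken from the original example; the only intended change from the example is the model type and name.

```python
# Assumption: the library is simpletransformers; the issue does not name it explicitly.
MODEL_TYPE = "mt5"  # the issue reports that setting this does not fix the nan loss
MODEL_NAME = "google/mt5-small"

# Hypothetical toy data in the (prefix, input_text, target_text) column layout.
TRAIN_ROWS = [
    ["translate english to german", "Hello", "Hallo"],
    ["translate english to german", "Thank you", "Danke"],
]


def run_repro():
    """Train briefly so the running loss can be observed in the progress output."""
    # Imported lazily so the settings above can be inspected without the libraries installed.
    import pandas as pd
    from simpletransformers.t5 import T5Args, T5Model

    train_df = pd.DataFrame(
        TRAIN_ROWS, columns=["prefix", "input_text", "target_text"]
    )

    model_args = T5Args()
    model_args.num_train_epochs = 1  # hypothetical value, enough to see the loss
    model_args.overwrite_output_dir = True

    model = T5Model(MODEL_TYPE, MODEL_NAME, args=model_args)
    model.train_model(train_df)
    return model
```

Calling `run_repro()` in a Colab runtime downloads the checkpoint and starts training; the running loss shown in the progress bar is the value reported as `nan` here.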
Expected behavior
During training, the running loss should be a finite number, not `nan`.
Desktop
- Google Colab
Additional context
This also happens when training mt5-base: the loss shown in the training progress is `nan`.
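To pin down when the loss stops being defined, the logged values can be checked for the first non-finite entry. This is a small sketch using only the standard library; the `first_non_finite` helper and the sample values are hypothetical, and how the per-step losses are collected depends on the training setup.

```python
import math


def first_non_finite(losses):
    """Return the index of the first nan or inf loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Hypothetical logged values: the loss becomes nan at step 2.
logged = [2.31, 1.87, float("nan"), float("nan")]
bad_step = first_non_finite(logged)
```

If the very first logged value is already `nan`, the problem is present from the first training step rather than appearing after the loss diverges.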
I have the same problem with mT5. I cannot get the fine-tuning task (translation between two languages) to work at all. Is this just a display problem, or does it actually affect the fine-tuning? Thanks!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.