simpletransformers

Running and eval loss are `nan` when using multilingual t5 model (e.g. google/mt5-small).

Open pashok3d opened this issue 3 years ago • 3 comments

Describe the bug: The running loss and eval loss are `nan` when using a multilingual T5 model (e.g. `google/mt5-small`). Setting the model type to `mt5` doesn't solve the problem.

To Reproduce: In the T5 Minimal Start example, replace the model name `t5-base` with `google/mt5-small`. Setting the model type to `mt5` doesn't solve the problem.
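For readers without the Minimal Start example at hand, the reproduction steps above might look roughly like the sketch below. It assumes the `T5Model`/`T5Args` interface from `simpletransformers.t5` and a DataFrame with `prefix`, `input_text`, and `target_text` columns. The toy rows, the `reproduce` helper name, and the chosen argument values are hypothetical and are not taken from the original report.

```python
# Toy rows (hypothetical); the original report used the T5 Minimal Start data.
TRAIN_ROWS = [
    {"prefix": "translate english to german", "input_text": "Hello", "target_text": "Hallo"},
    {"prefix": "translate english to german", "input_text": "Thank you", "target_text": "Danke"},
]
EVAL_ROWS = [
    {"prefix": "translate english to german", "input_text": "Good morning", "target_text": "Guten Morgen"},
]


def reproduce(model_type="mt5", model_name="google/mt5-small", use_cuda=True):
    """Train briefly so the running loss can be inspected (hypothetical helper)."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import pandas as pd
    from simpletransformers.t5 import T5Args, T5Model

    train_df = pd.DataFrame(TRAIN_ROWS)
    eval_df = pd.DataFrame(EVAL_ROWS)

    model_args = T5Args()
    model_args.num_train_epochs = 1
    model_args.overwrite_output_dir = True
    model_args.evaluate_during_training = True

    # Per the report, the model type ("t5" or "mt5") made no difference.
    model = T5Model(model_type, model_name, args=model_args, use_cuda=use_cuda)
    model.train_model(train_df, eval_data=eval_df)
    return model
```

Calling `reproduce()` in a Colab GPU runtime and watching the progress bar is one way to see whether the running loss turns into `nan`.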

Expected behavior: During training, the running loss should be a defined (finite) number.
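The expectation above can be written as a small check. This is only a sketch: it assumes the per-step loss values have already been collected as plain floats (how to collect them is not covered in this issue), and `first_non_finite` is a hypothetical helper name.

```python
import math
from typing import Iterable, Optional


def first_non_finite(losses: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Example with made-up values: the third entry (index 2) is NaN.
print(first_non_finite([2.31, 1.87, float("nan"), 1.02]))  # prints 2
```

A result of `None` would mean every recorded loss was a finite number, which is the behavior the report expects.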

Screenshots: [image]

Desktop (please complete the following information):

  • Google Colab


pashok3d, Feb 10 '22 14:02

This also shows up when training `mt5-base`: the returned loss is shown as `nan` in the training progress output.

ArtanisTheOne, Apr 12 '22 23:04

I have the same problem with mT5. In fact, I cannot get the fine-tuning task (translation between two languages) to work. Is it just a display problem, or does it actually affect the fine-tuning? Thanks!

JoeTseHot, Apr 24 '22 04:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Sep 21 '22 04:09