mmtracking
NAN loss when training on MOT20
Thanks for your error report and we appreciate it a lot.
Checklist
- I have searched related issues but cannot get the expected help.
- The bug has not been fixed in the latest version.
Describe the bug
When I train on MOT20 without any modification to the code, the loss is always NaN.
Reproduction
- What command or script did you run?
bash ./tools/dist_train.sh ./configs/det/faster-rcnn_r50_fpn_8e_mot20-half.py 8 \
--work-dir ./work_dirs/
- Did you make any modifications on the code or config? Did you understand what you have modified? No
- What dataset did you use and what task did you run? MOT20, training
Environment
- Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here.
- You may add additional information that may be helpful for locating the problem, such as
  - How you installed PyTorch [e.g., pip, conda, source]
  - Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)
Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.
Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
I used the same command, bash ./tools/dist_train.sh ./configs/det/faster-rcnn_r50_fpn_8e_mot20-half.py 8 --work-dir ./work_dirs/, and the detector trained successfully on MOT20.
Please refer to the picture
If you still get the error, try reducing your learning rate to 0.0001.