
Object detection high loss

Open jdaviddx opened this issue 2 years ago • 3 comments

Hello, I am trying to train an object-detection YOLOv4 model on the COCO dataset (half of the classes). After 100 epochs (about 30 hours on 4 V100 GPUs), the loss is 220 and the mAP is ~0.31. Is this a problem, or should I wait longer?

jdaviddx • Aug 10 '22 10:08

Hi @jdaviddx, here is the per-epoch log from when I trained YOLOv4 on COCO with the full 80 classes (evaluation was in SAMPLE mode, so the mAP is slightly worse than in INTEGRATE mode): yolov4_training_log_cspdarknet53.csv

Back then, I trained with 1 GPU. Could you also try with 1 GPU to help narrow down the issue? Thanks!

Tyler-D • Aug 11 '22 03:08
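
For readers comparing mAP numbers across the SAMPLE and INTEGRATE evaluation modes mentioned above, here is a minimal sketch of the two average-precision conventions they are usually taken to mean. It assumes SAMPLE corresponds to the Pascal VOC 11-point interpolated metric and INTEGRATE to the area under the precision-recall envelope; that mapping is an assumption here, not something confirmed in this thread. The function names `ap_sample` and `ap_integrate` are hypothetical and are not part of the TAO Toolkit API.

```python
import numpy as np


def ap_sample(recall, precision):
    """11-point interpolated AP (assumed meaning of SAMPLE mode)."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    ap = 0.0
    # Average the best precision reachable at recall >= t for t = 0.0, 0.1, ..., 1.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        best = precision[mask].max() if mask.any() else 0.0
        ap += best / 11.0
    return float(ap)


def ap_integrate(recall, precision):
    """Area under the precision envelope (assumed meaning of INTEGRATE mode)."""
    r = np.concatenate(([0.0], np.asarray(recall, dtype=float), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision, dtype=float), [0.0]))
    # Make precision non-increasing as recall grows (running max from the right)
    p = np.maximum.accumulate(p[::-1])[::-1]
    # Sum rectangle areas wherever recall changes
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))


if __name__ == "__main__":
    # Toy precision-recall points, recall sorted in increasing order
    recall = [0.45, 1.0]
    precision = [1.0, 0.5]
    print("11-point AP:", ap_sample(recall, precision))
    print("integrated AP:", ap_integrate(recall, precision))
```

The two conventions generally give different values for the same precision-recall curve, so mAP figures are only comparable when both runs use the same evaluation mode.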

Hello, I downloaded a pretrained ResNet-18 model from NGC and tried to train YOLOv4 on COCO with it. The starting loss was about 2 million, and after 30 epochs it dropped to 300. Do you use any backbones from NGC?

jdaviddx • Aug 11 '22 12:08

No. For SOTA training, we use the ImageNet-pretrained cspdarknet53. The pretrained models on NGC are trained on Open Images.

Tyler-D • Aug 11 '22 23:08