Training of experiment is done and the best AP is 0.00

Open jayer95 opened this issue 2 years ago • 6 comments

This is the training command I use,

python tools/train.py \
  -f exps/example/custom/nano-hagrid4k_one_quarter.py \
  -d 0 \
  -b 64 \
  --fp16 \
  -o \
  -c yolox_nano.pth

During training, the log looks like this:

2022-03-28 08:00:47 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 460/2043, mem: 10089Mb, iter_time: 8.669s, data_time: 8.448s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 608, ETA: 55 days, 11:49:53
2022-03-28 08:01:55 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 470/2043, mem: 10089Mb, iter_time: 6.826s, data_time: 6.610s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 608, ETA: 55 days, 11:43:37
2022-03-28 08:03:29 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 480/2043, mem: 10089Mb, iter_time: 9.435s, data_time: 9.261s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 480, ETA: 55 days, 11:46:47
2022-03-28 08:04:34 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 490/2043, mem: 10089Mb, iter_time: 6.462s, data_time: 6.331s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 384, ETA: 55 days, 11:39:12
2022-03-28 08:06:13 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 500/2043, mem: 10089Mb, iter_time: 9.919s, data_time: 9.702s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 576, ETA: 55 days, 11:44:06
2022-03-28 08:07:24 | INFO | yolox.core.trainer:252 - epoch: 14/300, iter: 510/2043, mem: 10089Mb, iter_time: 7.051s, data_time: 6.810s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 640, ETA: 55 days, 11:38:39
^C2022-03-28 08:07:59 | INFO | yolox.core.trainer:194 - Training of experiment is done and the best AP is 0.00
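
Two things are visible in every line of this log: all five loss values are exactly 0.0, and data_time is almost as large as iter_time. A rough sketch for checking a log line for both conditions is below. The field names are taken from the log itself; the function names and the 0.9 threshold are hypothetical choices for illustration, not part of YOLOX.

```python
import re

# Field names as they appear in the trainer log lines.
LOSS_FIELD = re.compile(r"(total_loss|iou_loss|l1_loss|conf_loss|cls_loss): ([0-9.eE+-]+)")
TIME_FIELD = re.compile(r"(iter_time|data_time): ([0-9.]+)s")


def parse_log_line(line):
    """Return the loss and timing fields of one log line as floats."""
    fields = {name: float(value) for name, value in LOSS_FIELD.findall(line)}
    fields.update({name: float(value) for name, value in TIME_FIELD.findall(line)})
    return fields


def diagnose(line, data_ratio_threshold=0.9):
    """Return a list of warnings for one log line (threshold is an arbitrary choice)."""
    fields = parse_log_line(line)
    warnings = []
    losses = [v for k, v in fields.items() if k.endswith("_loss")]
    if losses and all(v == 0.0 for v in losses):
        warnings.append("all losses are zero")
    iter_time = fields.get("iter_time", 0.0)
    if iter_time > 0 and fields.get("data_time", 0.0) / iter_time > data_ratio_threshold:
        warnings.append("data loading dominates iteration time")
    return warnings


sample = (
    "epoch: 14/300, iter: 460/2043, mem: 10089Mb, iter_time: 8.669s, "
    "data_time: 8.448s, total_loss: 0.0, iou_loss: 0.0, l1_loss: 0.0, "
    "conf_loss: 0.0, cls_loss: 0.0, lr: 9.980e-03, size: 608"
)
print(diagnose(sample))
```

If the first warning fires on every line, it is probably worth checking that the annotations are actually being loaded before looking at anything else.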

Why is this so? Where is the model I trained? I can't find it.
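
For locating the saved weights, a minimal sketch that lists every .pth file under a directory, newest first. It assumes the run writes its checkpoints as .pth files somewhere under an output folder; the folder name "YOLOX_outputs" used here is an assumption and should be checked against the output directory configured in your experiment file. The helper name find_checkpoints is hypothetical.

```python
from pathlib import Path


def find_checkpoints(root):
    """Return all .pth files below root, sorted newest first."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.pth"), key=lambda p: p.stat().st_mtime, reverse=True)


# "YOLOX_outputs" is an assumed location; replace it with your own output directory.
for ckpt in find_checkpoints("YOLOX_outputs"):
    print(ckpt)
```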

jayer95 avatar Mar 28 '22 00:03 jayer95

Have you solved this problem?

ilmoney avatar Apr 13 '22 08:04 ilmoney

Hello, have you solved this problem?

szgy66 avatar May 09 '22 12:05 szgy66

Hello, have you solved the problem? I used the training command from the official tutorial, but it reports an AP of zero too.

!python tools/train.py -n yolox-s -d 1 -b 16 --fp16 -o --cache -c /content/yolox_s.pth

2catycm avatar May 16 '22 12:05 2catycm

I used my own data with yolox_s and training was successful. My environment is Python 3.8 and PyTorch 1.11 with CUDA 11.2, and I installed everything by following the official documentation step by step.

ilmoney avatar May 16 '22 12:05 ilmoney

Any news on this? I have AP=0 for all metrics in eval.

masc-it avatar Aug 07 '22 07:08 masc-it

If you change your batch size, please check your learning rate carefully. Unreproducible performance is often caused by a wrong learning rate.
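
As a rough illustration of why the two are linked, here is a sketch of the linear scaling rule, where the learning rate grows in proportion to the total batch size. The per-image base rate of 0.01 / 64 is an assumption chosen to match the lr of about 1e-2 logged at -b 64 in the original post; check the value your own experiment file uses. The function name scaled_lr is hypothetical.

```python
def scaled_lr(batch_size, base_lr_per_img=0.01 / 64.0):
    """Linear scaling rule: learning rate proportional to total batch size.

    base_lr_per_img is an assumed default, not read from any config.
    """
    return base_lr_per_img * batch_size


for batch_size in (16, 64, 128):
    print(f"batch size {batch_size:>3}: lr = {scaled_lr(batch_size):.4e}")
```

Under this assumption, moving from -b 64 to -b 16 cuts the effective learning rate to a quarter, so a rate tuned by hand for one batch size does not carry over to another.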

FateScript avatar Aug 08 '22 03:08 FateScript

@FateScript I have faced the exact same issue, also after changing to -b 64.

ghsanti avatar Mar 03 '23 23:03 ghsanti