
caching images to ram generates 0 mAP score

Open buyukkanber opened this issue 2 years ago • 5 comments

Before Asking

  • [X] I have read the README carefully.

  • [X] I want to train my custom dataset, and I have read the tutorials for training custom data carefully and organized my dataset correctly. (FYI: We recommend applying the xx_finetune.py config files.)

  • [X] I have pulled the latest code of the main branch and run again, and the problem still exists.

Search before asking

  • [X] I have searched the YOLOv6 issues and found no similar questions.

Question

Hello!

When I try to train my custom dataset, I get a 0 mAP score on every epoch whenever I use the --cache-ram argument to accelerate training.

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Results saved to runs/train/exp
Epoch: 7 | [email protected]: 0.0 | [email protected]:0.95: 0.0

And if I don't cache images in RAM, each training epoch takes 15 minutes in the Colab environment, which makes model training almost impractical.
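For context, a minimal sketch of the in-RAM caching pattern at issue (this is an illustrative example, not YOLOv6's actual code; the `CachedDataset` class and `load_image` stub are hypothetical). A classic way such a cache produces 0 mAP is returning the cached array without copying, so in-place augmentations mutate the cache and the pixels drift away from their labels:

```python
import numpy as np

class CachedDataset:
    """Hypothetical dataset that lazily caches raw decoded images in RAM."""

    def __init__(self, paths):
        self.paths = paths
        self.cache = [None] * len(paths)  # filled lazily with raw images

    def load_image(self, index):
        # Stand-in for cv2.imread: a deterministic fake image per index.
        rng = np.random.default_rng(index)
        return rng.integers(0, 255, size=(4, 4, 3), dtype=np.uint8)

    def __getitem__(self, index):
        if self.cache[index] is None:
            # Cache the RAW image, before any augmentation is applied.
            self.cache[index] = self.load_image(index)
        # Copy so in-place augmentation below cannot corrupt the cache.
        img = self.cache[index].copy()
        img[...] = 255 - img  # toy in-place "augmentation" (inversion)
        return img

ds = CachedDataset(["a.jpg", "b.jpg"])
first = ds[0]
second = ds[0]
# Because of .copy(), the cached raw image is untouched and repeated
# reads of the same index agree with each other.
assert np.array_equal(first, second)
```

Dropping the `.copy()` here would invert the cached image in place on the first read, so every subsequent epoch would train on progressively corrupted pixels.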

A similar problem has been reported before, but I haven't seen a proper solution that fixes this issue.

Has there been any progress on this?

Thanks!

Additional

No response

buyukkanber avatar Nov 23 '23 17:11 buyukkanber

Could it be because this is the first epoch? A score of 0 in the first epoch should be normal.

todesti2 avatar Nov 24 '23 07:11 todesti2

Could you add your training command?

todesti2 avatar Nov 24 '23 07:11 todesti2

There's probably a bug in the code; I'm seeing the same thing.

clw5180 avatar Nov 25 '23 09:11 clw5180

@buyukkanber You can check this: https://github.com/meituan/YOLOv6/pull/948 . It has not been merged yet, so you can apply the change yourself.

clw5180 avatar Nov 25 '23 09:11 clw5180

With the latest change (https://github.com/meituan/YOLOv6/pull/948#issuecomment-1826263820) from https://github.com/meituan/YOLOv6/pull/948, it seems to work. However, there is a one-minute delay between epochs throughout training.

buyukkanber avatar Nov 26 '23 07:11 buyukkanber