PyTorch_YOLOv4
coco2017 training results map only 35.7
I used the default parameters and trained with the following command:
python train.py --device 4 --batch-size 16 --img 512 512 --data coco.yaml --cfg cfg/yolov4.cfg --weights '' --name yolov4-pacsp
Using CUDA device0 _CudaDeviceProperties(name='Tesla V100-SXM2-32GB', total_memory=32480MB)
Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='cfg/yolov4.cfg', data='./data/coco.yaml', device='4', epochs=300, evolve=False, global_rank=-1, hyp='data/hyp.scratch.yaml', img_size=[512, 512], local_rank=-1, logdir='runs/', multi_scale=False, name='yolov4-pacsp', noautoanchor=False, nosave=False, notest=False, rect=False, resume=False, single_cls=False, sync_bn=False, total_batch_size=16, weights='', world_size=1)
Start Tensorboard with "tensorboard --logdir runs/", view at http://localhost:6006/
Hyperparameters {'lr0': 0.01, 'momentum': 0.937, 'weight_decay': 0.0005, 'giou': 0.05, 'cls': 0.5, 'cls_pw': 1.0, 'obj': 1.0, 'obj_pw': 1.0, 'iou_t': 0.2, 'anchor_t': 4.0, 'fl_gamma': 0.0, 'hsv_h': 0.015, 'hsv_s': 0.7, 'hsv_v': 0.4, 'degrees': 0.0, 'translate': 0.0, 'scale': 0.5, 'shear': 0.0, 'perspective': 0.0, 'flipud': 0.0, 'fliplr': 0.5, 'mixup': 0.0}
Model Summary: 327 layers, 6.43631e+07 parameters, 6.43631e+07 gradients, 142.8 GFLOPS
Optimizer groups: 110 .bias, 110 conv.weight, 107 other
log:
0/299 16.2G 0.07564 0.1414 0.07705 0.2941 176 512 0.08018 0.04417 0.03022 0.01233 0.07018 0.1006 0.06409
1/299 24.5G 0.06105 0.1347 0.05651 0.2523 238 512 0.181 0.1989 0.1242 0.05767 0.06088 0.09591 0.04571
2/299 24.5G 0.05685 0.1298 0.0461 0.2327 258 512 0.2109 0.3024 0.2009 0.1022 0.05672 0.0931 0.03762
3/299 24.5G 0.05361 0.1261 0.04016 0.2199 217 512 0.2296 0.3767 0.2647 0.1398 0.05396 0.0909 0.03271
4/299 24.5G 0.05138 0.1233 0.03632 0.211 225 512 0.2365 0.4161 0.3034 0.1645 0.05229 0.08974 0.0299
5/299 24.5G 0.04991 0.1215 0.03396 0.2054 206 512 0.249 0.4353 0.324 0.1777 0.05133 0.08885 0.02838
6/299 24.5G 0.04893 0.1202 0.03219 0.2013 233 512 0.2605 0.4441 0.3377 0.1867 0.05071 0.08814 0.02744
could you provide results.txt?
By the way, the default setting trains with --img 640 640.
Thank you for your reply. This is my training log: results.txt. I found that the anchor sizes in yolov4.cfg are [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]. These are the same as in Darknet at 512, so I set the image size to 512.
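To make the anchor argument concrete, here is a minimal sketch that reads the flat list from yolov4.cfg as (width, height) pairs and shows what fraction of the input each anchor covers at 512 and at 640. The grouping into three anchors per detection scale is an assumption based on the usual YOLO layout, and the helper names are hypothetical, not part of the repository.

```python
# Flat anchor list quoted from yolov4.cfg in the comment above.
ANCHORS = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146,
           142, 110, 192, 243, 459, 401]


def anchor_pairs(flat):
    """Turn [w0, h0, w1, h1, ...] into [(w0, h0), (w1, h1), ...]."""
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def group_by_scale(pairs, per_scale=3):
    """Split the pairs into groups of `per_scale` (assumed: 3 anchors per scale)."""
    return [pairs[i:i + per_scale] for i in range(0, len(pairs), per_scale)]


def relative_size(pairs, img_size):
    """Express each anchor as a fraction of a square input of side `img_size`."""
    return [(w / img_size, h / img_size) for w, h in pairs]


pairs = anchor_pairs(ANCHORS)
for img_size in (512, 640):
    print(f"--img {img_size}")
    for group in group_by_scale(relative_size(pairs, img_size)):
        print("  " + ", ".join(f"({w:.3f}, {h:.3f})" for w, h in group))
```

The same pixel anchors cover a smaller fraction of a 640 input than of a 512 input, which is one reason an anchor set tuned for one resolution is not automatically a match for another.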
The original yolov4 was trained with darknet using multi-scale training. The new code uses jitter instead, so the resolution setting was changed to 640.
If you can, try training again in the nvcr.io/nvidia/pytorch:20.08-py3 docker environment.
Thanks. I will set image size to 640 and multi-scale on, then train again both in my docker and nvcr.io/nvidia/pytorch:20.08-py3 docker.
you could use your previous training command.
python train.py --device 4 --batch-size 16 --img 512 512 --data coco.yaml --cfg cfg/yolov4.cfg --weights '' --name yolov4
It seems that some versions of pytorch/cuda/cudnn produce weird results.
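Since the suspicion here is a version mismatch, it can help to post the exact versions with each results.txt. A minimal sketch, assuming PyTorch may or may not be installed on the machine where it runs; the function name `environment_report` is hypothetical:

```python
import platform


def environment_report():
    """Collect the Python / PyTorch / CUDA / cuDNN versions visible to this process."""
    report = {"python": platform.python_version(),
              "torch": None, "cuda": None, "cudnn": None}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed here; leave the remaining fields as None.
        return report
    report["torch"] = str(torch.__version__)
    report["cuda"] = torch.version.cuda              # None for CPU-only builds
    report["cudnn"] = torch.backends.cudnn.version() # None if cuDNN is unavailable
    return report


for name, value in environment_report().items():
    print(f"{name}: {value}")
```

Running this inside each docker image and on the host gives a like-for-like list to compare against the environments reported as working.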
The docker image you provided needs nvidia driver>450, so I can't run this experiment. Could you provide your results.txt? I want to compare it with mine to see whether that helps locate the problem.
I just trained with 640 640 for a while.
0/299 12.7G 0.08222 0.08686 0.08028 0.2494 23 640 0.05106 0.01807 0.02296 0.009321 0.06775 0.0715 0.06805
1/299 12.7G 0.06487 0.08444 0.06221 0.2115 7 640 0.2476 0.1556 0.1292 0.06344 0.05693 0.06702 0.0484
2/299 12.7G 0.05953 0.08114 0.04974 0.1904 22 640 0.2728 0.2982 0.2332 0.1259 0.05127 0.06413 0.03756
3/299 12.7G 0.05525 0.07849 0.04187 0.1756 28 640 0.2718 0.4086 0.3207 0.1843 0.04774 0.06168 0.03087
4/299 12.7G 0.05233 0.07645 0.03687 0.1657 3 640 0.2903 0.4619 0.3705 0.2204 0.04568 0.06018 0.02754
5/299 12.7G 0.05058 0.07531 0.03403 0.1599 26 640 0.3089 0.4882 0.4008 0.2433 0.04445 0.05919 0.02568
6/299 12.7G 0.04944 0.07431 0.03213 0.1559 36 640 0.3258 0.5029 0.4183 0.2574 0.04372 0.05857 0.02457
7/299 12.7G 0.04845 0.07338 0.03078 0.1526 17 640 0.3411 0.5111 0.4305 0.2667 0.0432 0.05812 0.02382
8/299 12.7G 0.04783 0.07309 0.02977 0.1507 49 640 0.3536 0.5148 0.4404 0.2748 0.0428 0.05776 0.02325
9/299 12.7G 0.0473 0.07247 0.0289 0.1487 7 640 0.3683 0.5167 0.4481 0.2809 0.04248 0.0575 0.02278
10/299 12.7G 0.04672 0.07187 0.02804 0.1466 16 640 0.3858 0.5174 0.4567 0.2874 0.04219 0.0573 0.02235
11/299 12.7G 0.04633 0.07169 0.02751 0.1455 24 640 0.3982 0.5174 0.4636 0.293 0.04191 0.05716 0.02194
12/299 12.7G 0.0461 0.07153 0.02709 0.1447 17 640 0.4137 0.5181 0.4697 0.2983 0.04165 0.05706 0.02156
13/299 12.7G 0.04576 0.07135 0.02663 0.1437 18 640 0.4257 0.5187 0.4759 0.3035 0.04139 0.05696 0.02119
14/299 12.7G 0.04544 0.07097 0.02609 0.1425 8 640 0.4356 0.5198 0.4828 0.309 0.04112 0.05686 0.0208
15/299 12.7G 0.0452 0.07079 0.02577 0.1418 45 640 0.4444 0.522 0.4901 0.3149 0.04086 0.05675 0.02042
16/299 12.7G 0.04496 0.07044 0.02554 0.1409 19 640 0.453 0.5272 0.4977 0.3205 0.04058 0.05661 0.02006
17/299 12.7G 0.04482 0.07046 0.02525 0.1405 31 640 0.4575 0.5327 0.5041 0.3261 0.04031 0.05644 0.01969
18/299 12.7G 0.04454 0.06976 0.02495 0.1393 12 640 0.4616 0.5389 0.5106 0.3312 0.04004 0.05624 0.01933
19/299 12.7G 0.04443 0.06981 0.02479 0.139 40 640 0.465 0.5468 0.5173 0.3361 0.03977 0.05601 0.019
20/299 12.7G 0.04433 0.06977 0.02473 0.1388 27 640 0.4653 0.5528 0.5233 0.341 0.0395 0.05577 0.01869
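To compare two runs row by row, the log lines can be parsed into named fields. The sketch below is hedged: the column names are an assumption inferred from the 15 whitespace-separated fields per row (they are not stated in this thread), and `parse_results` and `map_gap` are hypothetical helpers. The sample rows are epochs 5 and 6 copied from the 512 and 640 logs above.

```python
# Assumed column layout for each results.txt row (15 fields).
COLUMNS = ["epoch", "gpu_mem", "giou", "obj", "cls", "total", "targets",
           "img_size", "precision", "recall", "map50", "map",
           "val_giou", "val_obj", "val_cls"]

RUN_512 = """
5/299 24.5G 0.04991 0.1215 0.03396 0.2054 206 512 0.249 0.4353 0.324 0.1777 0.05133 0.08885 0.02838
6/299 24.5G 0.04893 0.1202 0.03219 0.2013 233 512 0.2605 0.4441 0.3377 0.1867 0.05071 0.08814 0.02744
"""

RUN_640 = """
5/299 12.7G 0.05058 0.07531 0.03403 0.1599 26 640 0.3089 0.4882 0.4008 0.2433 0.04445 0.05919 0.02568
6/299 12.7G 0.04944 0.07431 0.03213 0.1559 36 640 0.3258 0.5029 0.4183 0.2574 0.04372 0.05857 0.02457
"""


def parse_results(text):
    """Parse results.txt-style rows into dicts; rows with another field count are skipped."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        row = dict(zip(COLUMNS, fields))
        row["epoch"] = int(row["epoch"].split("/")[0])  # "6/299" -> 6
        for key in COLUMNS[2:]:                         # gpu_mem stays a string
            row[key] = float(row[key])
        rows.append(row)
    return rows


def map_gap(run_a, run_b):
    """Return {epoch: map_b - map_a} for epochs present in both runs."""
    a = {r["epoch"]: r["map"] for r in run_a}
    b = {r["epoch"]: r["map"] for r in run_b}
    return {e: b[e] - a[e] for e in sorted(a.keys() & b.keys())}


gaps = map_gap(parse_results(RUN_512), parse_results(RUN_640))
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: 640 run minus 512 run (assumed mAP column) = {gap:+.4f}")
```

Note that the two runs differ in image size, so the gap mixes a resolution effect with any environment effect; the comparison is most informative between runs that use the same command.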
Thank you for your reply. Did you set batch-size=8 and multi-scale off ?
Hello, I ran into the same problem as you: on coco2017 with an img_size of 512 the accuracy is only around 36. Have you solved it?
Not yet. I am currently training the YOLOv4pacsp-x-mish config; at epoch 147 the mAP reaches 39.8.
i set batch-size=16 and multi-scale off in https://github.com/WongKinYiu/PyTorch_YOLOv4/issues/232#issuecomment-759227582
and i try to train yolov4s with 512x512 on a 2080ti, it can reach 32+ AP.
Same environment and code, with only the network swapped?
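Hello, thank you for your answer. Did you change the environment? I am puzzled by the current results: the accuracy is far below the paper, and the curve shows that it has been growing very slowly since early in training. The growth curve of the reference source code at 640 looks normal, so the gap seems too large for a change of image size alone.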
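I did not change the environment. I only switched to the pacsp-x-mish config, whose mAP is 50.0; I am not sure whether training will eventually reach 50.0.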
My AP at epoch 147 with 640 is 45.2.
Hello, I tried pytorch 1.6, 1.7.0, and 1.7.1 separately; with an img_size of 512 the mAP only reaches about 36 in the end. But you and supeng0924 both get fairly good results at 640. What could be the reason?
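The docker environment the author provided earlier uses CUDA 11.1, whereas I trained with CUDA 10.1 + pytorch 1.6. I am not sure whether that alone could have such a large effect. I also do not have a CUDA 11.1 environment yet; do you have one to verify this?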
Hello, could you provide a list of docker environments? I have already verified this under cuda 11.0 + pytorch 1.7.1: it is currently at 0.335 after 165 epochs, and the trend shows very slow growth, so it will probably also end up at only about 36.
These are the environments I have tested without problems:
nvcr.io/nvidia/pytorch:20.02-py3
nvcr.io/nvidia/pytorch:20.03-py3
nvcr.io/nvidia/pytorch:20.06-py3
nvcr.io/nvidia/pytorch:20.08-py3
Hello, have you tried training at 512?
512 yolov4-pacsp-s-mish
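Hello, do you have results for yolov4 on coco2017 at a 512 scale? I have trained several times with your source code in different environments, and the accuracy always ends up around 36. I am now unsure whether this is an environment problem or whether the current parameter configuration is simply not well suited to the 512 scale.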
Hello, I am now running into the same problem you had: at a 512 scale the mAP only reaches 36%. Have you solved it?