
yolo-world-s reproduction on Objects365 fails

Open lluo-Desktop opened this issue 1 year ago • 8 comments

Hi, I set up the environment to reproduce YOLO-World-S. 1) Environment verification: with the provided checkpoint, I reach the same zero-shot mAP on LVIS as the official results. 2) Reproduction config: only the Objects365v1 data and its train.json are used:

```python
obj365v1_train_dataset = dict(
    type='MultiModalDataset',
    dataset=dict(
        type='YOLOv5Objects365V1Dataset',
        data_root='/my_path/datasets/objects365v1/',
        ann_file='/my_path/datasets/objects365v1/annotations/train.json',
        data_prefix=dict(img='train/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32)),
    class_text_path='data/texts/obj365v1_class_texts.json',
    pipeline=train_pipeline)
```

```python
train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    collate_fn=dict(type='yolow_collate'),
    dataset=dict(
        _delete_=True,
        type='ConcatDataset',
        datasets=[obj365v1_train_dataset],
        ignore_keys=['classes', 'palette']))
```

3) Hyperparameters: the same as the pretraining config, trained on 8×V100 with a batch size of 16 per GPU. 4) Training: the loss stays high, and inference yields no correct results.

2024/02/11 07:05:26 - mmengine - INFO - Epoch(train) [100][4300/4755] base_lr: 2.0000e-03 lr: 5.9600e-05 eta: 0:05:38 time: 0.7859 data_time: 0.0060 memory: 8815 grad_norm: 492.3932 loss: 433.7005 loss_cls: 160.5437 loss_bbox: 137.8297 loss_dfl: 135.3271
2024/02/11 07:06:03 - mmengine - INFO - Epoch(train) [100][4350/4755] base_lr: 2.0000e-03 lr: 5.9600e-05 eta: 0:05:01 time: 0.7353 data_time: 0.0059 memory: 9002 grad_norm: 491.6336 loss: 436.9689 loss_cls: 162.9173 loss_bbox: 138.7117 loss_dfl: 135.3400
2024/02/11 07:06:42 - mmengine - INFO - Epoch(train) [100][4400/4755] base_lr: 2.0000e-03 lr: 5.9600e-05 eta: 0:04:23 time: 0.7874 data_time: 0.0062 memory: 8922 grad_norm: 480.4001 loss: 437.0023 loss_cls: 161.6231 loss_bbox: 139.2435 loss_dfl: 136.1358
2024/02/11 07:07:20 - mmengine - INFO - Epoch(train) [100][4450/4755] base_lr: 2.0000e-03 lr: 5.9600e-05 eta: 0:03:46 time: 0.7536 data_time: 0.0060 memory: 8615 grad_norm: 487.9105 loss: 439.4708 loss_cls: 166.0740 loss_bbox: 138.1573 loss_dfl: 135.2395

Do you have any suggestions for reproducing the results?

lluo-Desktop avatar Feb 20 '24 02:02 lluo-Desktop

@lluo-Desktop Hi, do you have any intermediate evaluation results? You could start by checking the evaluation after 5 epochs. Also, we have not yet trained with this setup in the 8×16 setting (the default is 32×16); we can help verify the training on our side.

wondervictor avatar Feb 20 '24 07:02 wondervictor

@wondervictor Thanks for your reply! This is pretraining on Objects365, with load_from set to the official checkpoint. The train/val log at epoch 2:

2024/02/20 03:38:33 - mmengine - INFO - Epoch(train) [2][4700/4755] base_lr: 5.0000e-04 lr: 3.2809e-04 eta: 2 days, 23:09:22 time: 0.5640 data_time: 0.0045 memory: 11813 grad_norm: 848.5867 loss: 502.3191 loss_cls: 201.0484 loss_bbox: 155.8162 loss_dfl: 145.4545
2024/02/20 03:39:03 - mmengine - INFO - Epoch(train) [2][4750/4755] base_lr: 5.0000e-04 lr: 3.2983e-04 eta: 2 days, 23:10:32 time: 0.5896 data_time: 0.0044 memory: 8960 grad_norm: 925.2765 loss: 509.8672 loss_cls: 204.1596 loss_bbox: 157.7191 loss_dfl: 147.9884
2024/02/20 03:39:04 - mmengine - INFO - Exp name: exp1.3_yolo_world_s_8nx16bs_obj365v1_20240220_020931
2024/02/20 03:39:04 - mmengine - INFO - Saving checkpoint at 2 epochs
2024/02/20 03:39:08 - mmengine - WARNING - save_param_scheduler is True but self.param_schedulers is None, so skip saving parameter schedulers
2024/02/20 03:39:18 - mmengine - INFO - Epoch(val) [2][ 50/602] eta: 0:01:28 time: 0.1608 data_time: 0.0074 memory: 9160
...
2024/02/20 03:40:33 - mmengine - INFO - Epoch(val) [2][600/602] eta: 0:00:00 time: 0.1382 data_time: 0.0002 memory: 924
2024/02/20 03:40:56 - mmengine - INFO - Evaluating bbox...
2024/02/20 03:42:21 - mmengine - INFO - Epoch(val) [2][602/602] lvis/bbox_AP: 0.0070 lvis/bbox_AP50: 0.0080 lvis/bbox_AP75: 0.0070 lvis/bbox_APs: 0.0020 lvis/bbox_APm: 0.0070 lvis/bbox_APl: 0.0200 lvis/bbox_APr: 0.0000 lvis/bbox_APc: 0.0080 lvis/bbox_APf: 0.0070 data_time: 0.0009 time: 0.1393

Today I tried fine-tuning on COCO; the loss and val metrics look normal (8×V100, batch size 16 per GPU, lr=2e-4). Partial train/val log below:

...
2024/02/20 07:38:33 - mmengine - INFO - Epoch(train) [2][900/925] base_lr: 2.0000e-04 lr: 1.2983e-04 eta: 9:42:33 time: 0.4756 data_time: 0.0047 memory: 8196 grad_norm: 1069.7471 loss: 428.3111 loss_cls: 150.2849 loss_bbox: 130.5900 loss_dfl: 147.4362
2024/02/20 07:38:45 - mmengine - INFO - Exp name: yolo_world_s_dual_vlpan_2e-4_80e_8gpus_finetune_coco_20240220_072120
2024/02/20 07:38:45 - mmengine - INFO - Saving checkpoint at 2 epochs
2024/02/20 07:38:48 - mmengine - WARNING - save_param_scheduler is True but self.param_schedulers is None, so skip saving parameter schedulers
2024/02/20 07:38:51 - mmengine - INFO - Epoch(val) [2][ 50/625] eta: 0:00:09 time: 0.0164 data_time: 0.0006 memory: 8076
...
2024/02/20 07:38:59 - mmengine - INFO - Epoch(val) [2][600/625] eta: 0:00:00 time: 0.0144 data_time: 0.0002 memory: 850
2024/02/20 07:39:11 - mmengine - INFO - Evaluating bbox...
2024/02/20 07:40:17 - mmengine - INFO - bbox_mAP_copypaste: 0.394 0.548 0.432 0.220 0.427 0.526

lluo-Desktop avatar Feb 20 '24 08:02 lluo-Desktop

Hold on, let me try this setting on my side.

wondervictor avatar Feb 20 '24 09:02 wondervictor

Hold on, let me try this setting on my side.

Where are your training logs provided? I couldn't seem to find them. Thanks.

taofuyu avatar Feb 21 '24 08:02 taofuyu

@lluo-Desktop Hi, I have now trained YOLO-World-S with the Objects365v1, 8×16 bs setting; the early log is as follows:

02/22 17:49:39 - mmengine - INFO - Epoch(train)   [1][  50/4755]  lr: 6.8700e-06  eta: 6 days, 20:34:55  time: 1.2462  data_time: 0.5570  memory: 13214  grad_norm: nan  loss: 1776.3508  loss_cls: 733.2614  loss_bbox: 500.5636  loss_dfl: 542.5258
02/22 17:50:15 - mmengine - INFO - Epoch(train)   [1][ 100/4755]  lr: 1.3880e-05  eta: 5 days, 9:40:25  time: 0.7177  data_time: 0.3736  memory: 7009  grad_norm: 1095.3765  loss: 1732.4874  loss_cls: 692.1060  loss_bbox: 502.5964  loss_dfl: 537.7850
02/22 17:50:47 - mmengine - INFO - Epoch(train)   [1][ 150/4755]  lr: 2.0890e-05  eta: 4 days, 19:04:21  time: 0.6505  data_time: 0.2494  memory: 6462  grad_norm: 926.0114  loss: 1685.0875  loss_cls: 646.4953  loss_bbox: 506.1760  loss_dfl: 532.4163
02/22 17:51:17 - mmengine - INFO - Epoch(train)   [1][ 200/4755]  lr: 2.7900e-05  eta: 4 days, 10:01:39  time: 0.5978  data_time: 0.2741  memory: 5902  grad_norm: 731.0350  loss: 1621.9136  loss_cls: 603.8184  loss_bbox: 499.2010  loss_dfl: 518.8943
02/22 17:51:48 - mmengine - INFO - Epoch(train)   [1][ 250/4755]  lr: 3.4911e-05  eta: 4 days, 5:00:20  time: 0.6133  data_time: 0.2042  memory: 7715  grad_norm: 661.7663  loss: 1568.2445  loss_cls: 580.4432  loss_bbox: 487.5858  loss_dfl: 500.2155
02/22 17:52:18 - mmengine - INFO - Epoch(train)   [1][ 300/4755]  lr: 4.1921e-05  eta: 4 days, 1:18:26  time: 0.5975  data_time: 0.1717  memory: 6475  grad_norm: 725.5072  loss: 1511.0823  loss_cls: 564.0430  loss_bbox: 461.9295  loss_dfl: 485.1097
02/22 17:52:42 - mmengine - INFO - Epoch(train)   [1][ 350/4755]  lr: 4.8931e-05  eta: 3 days, 20:33:57  time: 0.4862  data_time: 0.1077  memory: 5902  grad_norm: inf  loss: 1468.3130  loss_cls: 555.9068  loss_bbox: 445.7742  loss_dfl: 466.6320
02/22 17:53:10 - mmengine - INFO - Epoch(train)   [1][ 400/4755]  lr: 5.5941e-05  eta: 3 days, 18:06:37  time: 0.5531  data_time: 0.0768  memory: 6795  grad_norm: 748.6951  loss: 1396.6288  loss_cls: 536.2165  loss_bbox: 415.3218  loss_dfl: 445.0905
02/22 17:53:37 - mmengine - INFO - Epoch(train)   [1][ 450/4755]  lr: 6.2951e-05  eta: 3 days, 16:04:32  time: 0.5447  data_time: 0.0875  memory: 7235  grad_norm: 732.4562  loss: 1345.2445  loss_cls: 530.3352  loss_bbox: 390.0545  loss_dfl: 424.8548
02/22 17:54:04 - mmengine - INFO - Epoch(train)   [1][ 500/4755]  lr: 6.9961e-05  eta: 3 days, 14:24:34  time: 0.5419  data_time: 0.0745  memory: 6329  grad_norm: 713.2585  loss: 1280.7532  loss_cls: 512.5199  loss_bbox: 368.0122  loss_dfl: 400.2211
02/22 17:54:32 - mmengine - INFO - Epoch(train)   [1][ 550/4755]  lr: 7.6972e-05  eta: 3 days, 13:15:44  time: 0.5600  data_time: 0.0562  memory: 7169  grad_norm: 703.7077  loss: 1238.0703  loss_cls: 503.2109  loss_bbox: 352.5556  loss_dfl: 382.3037
02/22 17:54:58 - mmengine - INFO - Epoch(train)   [1][ 600/4755]  lr: 8.3982e-05  eta: 3 days, 11:54:54  time: 0.5245  data_time: 0.0349  memory: 5862  grad_norm: 705.5707  loss: 1206.8551  loss_cls: 496.0337  loss_bbox: 344.0969  loss_dfl: 366.7244
02/22 17:55:25 - mmengine - INFO - Epoch(train)   [1][ 650/4755]  lr: 9.0992e-05  eta: 3 days, 10:51:18  time: 0.5325  data_time: 0.0505  memory: 6382  grad_norm: 681.6300  loss: 1171.1414  loss_cls: 483.5630  loss_bbox: 334.9132  loss_dfl: 352.6652
02/22 17:55:52 - mmengine - INFO - Epoch(train)   [1][ 700/4755]  lr: 9.8002e-05  eta: 3 days, 9:56:12  time: 0.5316  data_time: 0.0846  memory: 5902  grad_norm: 662.0694  loss: 1152.3215  loss_cls: 480.0629  loss_bbox: 329.5538  loss_dfl: 342.7047
02/22 17:56:19 - mmengine - INFO - Epoch(train)   [1][ 750/4755]  lr: 1.0501e-04  eta: 3 days, 9:17:07  time: 0.5481  data_time: 0.1094  memory: 7089  grad_norm: 653.8856  loss: 1124.8583  loss_cls: 474.3178  loss_bbox: 318.0799  loss_dfl: 332.4606
02/22 17:56:48 - mmengine - INFO - Epoch(train)   [1][ 800/4755]  lr: 1.1202e-04  eta: 3 days, 9:02:03  time: 0.5869  data_time: 0.1455  memory: 10582  grad_norm: 628.4977  loss: 1106.5697  loss_cls: 472.1445  loss_bbox: 312.6728  loss_dfl: 321.7523
02/22 17:57:12 - mmengine - INFO - Epoch(train)   [1][ 850/4755]  lr: 1.1903e-04  eta: 3 days, 7:54:21  time: 0.4702  data_time: 0.0982  memory: 5796  grad_norm: 599.6026  loss: 1087.8212  loss_cls: 460.1374  loss_bbox: 311.1525  loss_dfl: 316.5314
02/22 17:57:39 - mmengine - INFO - Epoch(train)   [1][ 900/4755]  lr: 1.2604e-04  eta: 3 days, 7:22:00  time: 0.5336  data_time: 0.0508  memory: 8169  grad_norm: 625.8642  loss: 1066.0954  loss_cls: 453.7859  loss_bbox: 304.1042  loss_dfl: 308.2053
02/22 17:58:11 - mmengine - INFO - Epoch(train)   [1][ 950/4755]  lr: 1.3305e-04  eta: 3 days, 7:40:01  time: 0.6465  data_time: 0.0887  memory: 6195  grad_norm: 581.4653  loss: 1050.8522  loss_cls: 451.0907  loss_bbox: 298.9823  loss_dfl: 300.7792
02/22 17:58:52 - mmengine - INFO - Epoch(train)   [1][1000/4755]  lr: 1.4006e-04  eta: 3 days, 9:06:53  time: 0.8254  data_time: 0.2640  memory: 8595  grad_norm: 564.5235  loss: 1040.2058  loss_cls: 446.7620  loss_bbox: 297.4854  loss_dfl: 295.9584
02/22 17:59:27 - mmengine - INFO - Epoch(train)   [1][1050/4755]  lr: 1.4707e-04  eta: 3 days, 9:36:47  time: 0.6962  data_time: 0.1749  memory: 6235  grad_norm: 564.7915  loss: 1022.5335  loss_cls: 439.8820  loss_bbox: 291.4202  loss_dfl: 291.2313
02/22 17:59:48 - mmengine - INFO - Epoch(train)   [1][1100/4755]  lr: 1.5408e-04  eta: 3 days, 8:26:35  time: 0.4254  data_time: 0.0137  memory: 7769  grad_norm: 530.7317  loss: 1004.7401  loss_cls: 435.0032  loss_bbox: 286.9149  loss_dfl: 282.8220
02/22 18:00:12 - mmengine - INFO - Epoch(train)   [1][1150/4755]  lr: 1.6109e-04  eta: 3 days, 7:37:05  time: 0.4679  data_time: 0.0091  memory: 8102  grad_norm: inf  loss: 995.4325  loss_cls: 431.1275  loss_bbox: 285.3974  loss_dfl: 278.9076
02/22 18:00:33 - mmengine - INFO - Epoch(train)   [1][1200/4755]  lr: 1.6810e-04  eta: 3 days, 6:40:58  time: 0.4354  data_time: 0.0467  memory: 6942  grad_norm: 520.9851  loss: 980.0758  loss_cls: 423.9543  loss_bbox: 280.7031  loss_dfl: 275.4183
02/22 18:00:56 - mmengine - INFO - Epoch(train)   [1][1250/4755]  lr: 1.7511e-04  eta: 3 days, 5:54:36  time: 0.4521  data_time: 0.0422  memory: 7249  grad_norm: 544.1749  loss: 964.0905  loss_cls: 419.3851  loss_bbox: 275.0380  loss_dfl: 269.6675
02/22 18:01:21 - mmengine - INFO - Epoch(train)   [1][1300/4755]  lr: 1.8212e-04  eta: 3 days, 5:25:41  time: 0.4979  data_time: 0.0533  memory: 5716  grad_norm: 511.4465  loss: 968.7069  loss_cls: 424.2675  loss_bbox: 276.3973  loss_dfl: 268.0421
02/22 18:01:40 - mmengine - INFO - Epoch(train)   [1][1350/4755]  lr: 1.8913e-04  eta: 3 days, 4:25:29  time: 0.3838  data_time: 0.0173  memory: 6102  grad_norm: 527.1621  loss: 943.2824  loss_cls: 409.6581  loss_bbox: 270.7041  loss_dfl: 262.9202
02/22 18:02:03 - mmengine - INFO - Epoch(train)   [1][1400/4755]  lr: 1.9614e-04  eta: 3 days, 3:48:41  time: 0.4516  data_time: 0.0082  memory: 5956  grad_norm: 499.9155  loss: 944.2081  loss_cls: 409.8294  loss_bbox: 271.4986  loss_dfl: 262.8802
02/22 18:02:24 - mmengine - INFO - Epoch(train)   [1][1450/4755]  lr: 2.0315e-04  eta: 3 days, 3:08:27  time: 0.4298  data_time: 0.0283  memory: 7302  grad_norm: 519.8153  loss: 930.8750  loss_cls: 408.7763  loss_bbox: 264.6100  loss_dfl: 257.4886
02/22 18:02:47 - mmengine - INFO - Epoch(train)   [1][1500/4755]  lr: 2.1016e-04  eta: 3 days, 2:35:17  time: 0.4465  data_time: 0.0129  memory: 5969  grad_norm: 484.2098  loss: 919.9606  loss_cls: 403.3247  loss_bbox: 262.1625  loss_dfl: 254.4734
02/22 18:03:12 - mmengine - INFO - Epoch(train)   [1][1550/4755]  lr: 2.1717e-04  eta: 3 days, 2:17:49  time: 0.4998  data_time: 0.0118  memory: 6755  grad_norm: 493.1965  loss: 911.3023  loss_cls: 399.2407  loss_bbox: 261.1044  loss_dfl: 250.9571
02/22 18:03:31 - mmengine - INFO - Epoch(train)   [1][1600/4755]  lr: 2.2419e-04  eta: 3 days, 1:32:31  time: 0.3827  data_time: 0.0064  memory: 6022  grad_norm: 492.0642  loss: 899.3547  loss_cls: 392.3960  loss_bbox: 259.0811  loss_dfl: 247.8776
02/22 18:03:53 - mmengine - INFO - Epoch(train)   [1][1650/4755]  lr: 2.3120e-04  eta: 3 days, 1:05:51  time: 0.4492  data_time: 0.0184  memory: 10795  grad_norm: 492.8047  loss: 896.6532  loss_cls: 392.5849  loss_bbox: 258.3923  loss_dfl: 245.6760
02/22 18:04:14 - mmengine - INFO - Epoch(train)   [1][1700/4755]  lr: 2.3821e-04  eta: 3 days, 0:34:38  time: 0.4229  data_time: 0.0707  memory: 7422  grad_norm: 452.7797  loss: 888.4180  loss_cls: 387.3658  loss_bbox: 257.7762  loss_dfl: 243.2760
02/22 18:04:37 - mmengine - INFO - Epoch(train)   [1][1750/4755]  lr: 2.4522e-04  eta: 3 days, 0:13:11  time: 0.4584  data_time: 0.0078  memory: 6289  grad_norm: 471.3941  loss: 875.8123  loss_cls: 383.2677  loss_bbox: 253.1041  loss_dfl: 239.4404
02/22 18:05:02 - mmengine - INFO - Epoch(train)   [1][1800/4755]  lr: 2.5223e-04  eta: 3 days, 0:02:03  time: 0.5002  data_time: 0.0099  memory: 6849  grad_norm: 466.7350  loss: 871.8134  loss_cls: 381.8615  loss_bbox: 250.5506  loss_dfl: 239.4013
02/22 18:05:21 - mmengine - INFO - Epoch(train)   [1][1850/4755]  lr: 2.5924e-04  eta: 2 days, 23:25:26  time: 0.3780  data_time: 0.0386  memory: 5662  grad_norm: 449.1701  loss: 857.2986  loss_cls: 373.9115  loss_bbox: 247.0927  loss_dfl: 236.2944
02/22 18:05:42 - mmengine - INFO - Epoch(train)   [1][1900/4755]  lr: 2.6625e-04  eta: 2 days, 22:58:41  time: 0.4162  data_time: 0.0354  memory: 5942  grad_norm: 452.8211  loss: 854.7589  loss_cls: 372.4334  loss_bbox: 246.2098  loss_dfl: 236.1157

wondervictor avatar Feb 22 '24 10:02 wondervictor

@wondervictor Hi, thanks for the feedback. The loss curve matches my log, so we need to see whether the results at val time are correct. (If your reproduction turns out fine, could you share the config & log files so I can diff them against mine?) Tracing back through the dataset, I found that the Objects365v1 (2019) annotation file I use has some category-name differences from the version mmdet uses by default ("human" vs. "person", letter-case differences). In theory, a few missing categories should not make the results completely wrong; I have just corrected the JSON and am re-running.
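A quick way to surface such naming mismatches is to diff the `categories` arrays of the two annotation JSONs. Below is a hedged sketch (the helper name, demo data, and the lowercase normalization rule are my assumptions, not part of the repo):

```python
import json  # for loading real annotation files, as noted below


def category_name_diff(cats_a, cats_b):
    """Return names present in one COCO-style `categories` list but not the
    other, comparing case-insensitively so pure casing changes pair up."""
    norm = lambda cats: {c['name'].strip().lower() for c in cats}
    a, b = norm(cats_a), norm(cats_b)
    return sorted(a - b), sorted(b - a)


# Tiny demo with the 'human' vs 'person' mismatch mentioned above; for real
# files, pass e.g. json.load(open('annotations/train.json'))['categories'].
cats_2019 = [{'id': 1, 'name': 'human'}, {'id': 2, 'name': 'Sneakers'}]
cats_mmdet = [{'id': 1, 'name': 'person'}, {'id': 2, 'name': 'sneakers'}]
only_a, only_b = category_name_diff(cats_2019, cats_mmdet)
print(only_a, only_b)  # ['human'] ['person']
```

Names surfaced this way are genuine renames (like human/person); anything absorbed by the lowercase normalization is just a casing difference.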

lluo-Desktop avatar Feb 23 '24 04:02 lluo-Desktop

@lluo-Desktop @taofuyu I suggest downloading Objects365 from the OpenDataLab link I provided. I also struggled with processing Objects365 before; that version is aligned with our code. I have now trained ~30 epochs under this config, and the results look normal so far. See 20240222_174524.log for reference.

wondervictor avatar Feb 23 '24 06:02 wondervictor

@wondervictor Thank you very much for your reply. I re-downloaded the Objects365v1 (2019-08-02) JSON file and am re-running the reproduction. It should indeed be a dataset-version issue (the 2019 Objects365v1 JSON I obtained from our in-house data market has 11 differently named labels).

lluo-Desktop avatar Feb 23 '24 07:02 lluo-Desktop

Thank you for open-sourcing such excellent work. I am reproducing YOLO-World-S on Objects365v1, but my early loss trend differs from yours; is that normal? I don't have V100s, so I'm training on only two 4090s, also with a batch size of 16 per card. I didn't change any other parameters, except that the training data is Objects365v1 only and the GPU count is 2. My Objects365v1 dataset was downloaded from your link. Thanks.

03/12 17:57:03 - mmengine - INFO - Epoch(train)   [1][   50/19019]  base_lr: 2.0000e-03 lr: 1.7176e-06  eta: 7 days, 21:30:37  time: 0.3587  data_time: 0.0372  memory: 9180  grad_norm: nan  loss: 446.1266  loss_cls: 185.6677  loss_bbox: 123.7541  loss_dfl: 136.7047
03/12 17:57:15 - mmengine - INFO - Epoch(train)   [1][  100/19019]  base_lr: 2.0000e-03 lr: 3.4702e-06  eta: 6 days, 14:30:04  time: 0.2413  data_time: 0.0039  memory: 7734  grad_norm: 565.1302  loss: 445.8083  loss_cls: 184.7159  loss_bbox: 124.9006  loss_dfl: 136.1918
03/12 17:57:28 - mmengine - INFO - Epoch(train)   [1][  150/19019]  base_lr: 2.0000e-03 lr: 5.2228e-06  eta: 6 days, 5:49:49  time: 0.2508  data_time: 0.0038  memory: 6040  grad_norm: 538.9717  loss: 444.6346  loss_cls: 184.0897  loss_bbox: 124.3050  loss_dfl: 136.2399
03/12 17:57:39 - mmengine - INFO - Epoch(train)   [1][  200/19019]  base_lr: 2.0000e-03 lr: 6.9755e-06  eta: 5 days, 23:20:15  time: 0.2345  data_time: 0.0040  memory: 5533  grad_norm: inf  loss: 441.9562  loss_cls: 182.6187  loss_bbox: 123.2948  loss_dfl: 136.0427
03/12 17:57:51 - mmengine - INFO - Epoch(train)   [1][  250/19019]  base_lr: 2.0000e-03 lr: 8.7281e-06  eta: 5 days, 19:42:03  time: 0.2370  data_time: 0.0038  memory: 7546  grad_norm: 506.1350  loss: 438.6803  loss_cls: 180.0693  loss_bbox: 123.0374  loss_dfl: 135.5736
03/12 17:58:04 - mmengine - INFO - Epoch(train)   [1][  300/19019]  base_lr: 2.0000e-03 lr: 1.0481e-05  eta: 5 days, 18:12:13  time: 0.2475  data_time: 0.0039  memory: 5693  grad_norm: 466.7960  loss: 435.8879  loss_cls: 177.5164  loss_bbox: 123.7570  loss_dfl: 134.6145
03/12 17:58:15 - mmengine - INFO - Epoch(train)   [1][  350/19019]  base_lr: 2.0000e-03 lr: 1.2233e-05  eta: 5 days, 16:09:50  time: 0.2347  data_time: 0.0039  memory: 6253  grad_norm: 450.4666  loss: 429.7469  loss_cls: 173.1966  loss_bbox: 123.0853  loss_dfl: 133.4649
03/12 17:58:27 - mmengine - INFO - Epoch(train)   [1][  400/19019]  base_lr: 2.0000e-03 lr: 1.3986e-05  eta: 5 days, 14:47:06  time: 0.2370  data_time: 0.0037  memory: 6053  grad_norm: 460.2511  loss: 422.2496  loss_cls: 168.1797  loss_bbox: 121.7420  loss_dfl: 132.3279
03/12 17:58:39 - mmengine - INFO - Epoch(train)   [1][  450/19019]  base_lr: 2.0000e-03 lr: 1.5739e-05  eta: 5 days, 14:15:45  time: 0.2463  data_time: 0.0040  memory: 6173  grad_norm: 526.7233  loss: 415.7690  loss_cls: 164.0228  loss_bbox: 120.9978  loss_dfl: 130.7484
03/12 17:58:51 - mmengine - INFO - Epoch(train)   [1][  500/19019]  base_lr: 2.0000e-03 lr: 1.7491e-05  eta: 5 days, 13:11:15  time: 0.2339  data_time: 0.0037  memory: 8267  grad_norm: 540.6158  loss: 411.8747  loss_cls: 161.5670  loss_bbox: 120.9805  loss_dfl: 129.3272
03/12 17:59:03 - mmengine - INFO - Epoch(train)   [1][  550/19019]  base_lr: 2.0000e-03 lr: 1.9244e-05  eta: 5 days, 12:17:28  time: 0.2336  data_time: 0.0040  memory: 5800  grad_norm: 532.7258  loss: 404.0686  loss_cls: 158.0140  loss_bbox: 118.3286  loss_dfl: 127.7259
03/12 17:59:15 - mmengine - INFO - Epoch(train)   [1][  600/19019]  base_lr: 2.0000e-03 lr: 2.0997e-05  eta: 5 days, 11:37:55  time: 0.2356  data_time: 0.0039  memory: 6546  grad_norm: 513.6272  loss: 400.2819  loss_cls: 156.8279  loss_bbox: 117.0981  loss_dfl: 126.3559
03/12 17:59:27 - mmengine - INFO - Epoch(train)   [1][  650/19019]  base_lr: 2.0000e-03 lr: 2.2749e-05  eta: 5 days, 11:43:15  time: 0.2515  data_time: 0.0039  memory: 9239  grad_norm: 511.9093  loss: 394.8929  loss_cls: 152.5437  loss_bbox: 117.8386  loss_dfl: 124.5105
03/12 17:59:39 - mmengine - INFO - Epoch(train)   [1][  700/19019]  base_lr: 2.0000e-03 lr: 2.4502e-05  eta: 5 days, 11:16:46  time: 0.2378  data_time: 0.0040  memory: 5706  grad_norm: 513.3235  loss: 388.5590  loss_cls: 151.2647  loss_bbox: 115.0196  loss_dfl: 122.2748
03/12 17:59:51 - mmengine - INFO - Epoch(train)   [1][  750/19019]  base_lr: 2.0000e-03 lr: 2.6254e-05  eta: 5 days, 10:48:51  time: 0.2355  data_time: 0.0041  memory: 7240  grad_norm: 514.5650  loss: 383.7197  loss_cls: 150.8671  loss_bbox: 112.1987  loss_dfl: 120.6540
03/12 18:00:03 - mmengine - INFO - Epoch(train)   [1][  800/19019]  base_lr: 2.0000e-03 lr: 2.8007e-05  eta: 5 days, 10:51:38  time: 0.2492  data_time: 0.0040  memory: 6333  grad_norm: 508.5694  loss: 376.1887  loss_cls: 148.2612  loss_bbox: 109.8921  loss_dfl: 118.0354
03/12 18:00:15 - mmengine - INFO - Epoch(train)   [1][  850/19019]  base_lr: 2.0000e-03 lr: 2.9760e-05  eta: 5 days, 10:38:19  time: 0.2408  data_time: 0.0041  memory: 5920  grad_norm: 480.6808  loss: 369.1250  loss_cls: 145.7235  loss_bbox: 107.7232  loss_dfl: 115.6783
03/12 18:00:27 - mmengine - INFO - Epoch(train)   [1][  900/19019]  base_lr: 2.0000e-03 lr: 3.1512e-05  eta: 5 days, 10:20:42  time: 0.2375  data_time: 0.0038  memory: 5946  grad_norm: 458.0789  loss: 358.9182  loss_cls: 141.4935  loss_bbox: 105.1733  loss_dfl: 112.2514
03/12 18:00:40 - mmengine - INFO - Epoch(train)   [1][  950/19019]  base_lr: 2.0000e-03 lr: 3.3265e-05  eta: 5 days, 10:23:21  time: 0.2485  data_time: 0.0039  memory: 5800  grad_norm: 462.1731  loss: 353.5276  loss_cls: 141.4663  loss_bbox: 102.6783  loss_dfl: 109.3830
03/12 18:00:51 - mmengine - INFO - Exp name: yolo_world_v2_s_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_train_lvis_minival_20240312_175448
03/12 18:00:51 - mmengine - INFO - Epoch(train)   [1][ 1000/19019]  base_lr: 2.0000e-03 lr: 3.5018e-05  eta: 5 days, 10:03:59  time: 0.2348  data_time: 0.0038  memory: 6706  grad_norm: 437.9579  loss: 348.6449  loss_cls: 140.3549  loss_bbox: 101.0544  loss_dfl: 107.2357
03/12 18:01:03 - mmengine - INFO - Epoch(train)   [1][ 1050/19019]  base_lr: 2.0000e-03 lr: 3.6770e-05  eta: 5 days, 9:47:01  time: 0.2352  data_time: 0.0038  memory: 6786  grad_norm: 405.5232  loss: 341.1842  loss_cls: 137.7237  loss_bbox: 99.0177  loss_dfl: 104.4427
03/12 18:01:15 - mmengine - INFO - Epoch(train)   [1][ 1100/19019]  base_lr: 2.0000e-03 lr: 3.8523e-05  eta: 5 days, 9:27:24  time: 0.2323  data_time: 0.0038  memory: 5586  grad_norm: 364.3518  loss: 335.0305  loss_cls: 136.1223  loss_bbox: 96.9399  loss_dfl: 101.9682
03/12 18:01:27 - mmengine - INFO - Epoch(train)   [1][ 1150/19019]  base_lr: 2.0000e-03 lr: 4.0276e-05  eta: 5 days, 9:29:59  time: 0.2472  data_time: 0.0039  memory: 6013  grad_norm: 358.9744  loss: 330.3108  loss_cls: 135.4357  loss_bbox: 95.4105  loss_dfl: 99.4646
03/12 18:01:39 - mmengine - INFO - Epoch(train)   [1][ 1200/19019]  base_lr: 2.0000e-03 lr: 4.2028e-05  eta: 5 days, 9:26:46  time: 0.2430  data_time: 0.0038  memory: 11920  grad_norm: 344.4472  loss: 327.1931  loss_cls: 134.6496  loss_bbox: 94.8774  loss_dfl: 97.6660
03/12 18:01:51 - mmengine - INFO - Epoch(train)   [1][ 1250/19019]  base_lr: 2.0000e-03 lr: 4.3781e-05  eta: 5 days, 9:11:37  time: 0.2334  data_time: 0.0037  memory: 5560  grad_norm: 337.1382  loss: 321.7565  loss_cls: 132.5439  loss_bbox: 92.8661  loss_dfl: 96.3464
03/12 18:02:03 - mmengine - INFO - Epoch(train)   [1][ 1300/19019]  base_lr: 2.0000e-03 lr: 4.5533e-05  eta: 5 days, 9:14:24  time: 0.2472  data_time: 0.0037  memory: 5973  grad_norm: 323.9022  loss: 318.5945  loss_cls: 131.9618  loss_bbox: 91.9790  loss_dfl: 94.6536
03/12 18:02:15 - mmengine - INFO - Epoch(train)   [1][ 1350/19019]  base_lr: 2.0000e-03 lr: 4.7286e-05  eta: 5 days, 9:01:40  time: 0.2341  data_time: 0.0039  memory: 6306  grad_norm: 319.6556  loss: 314.5906  loss_cls: 131.8039  loss_bbox: 89.8786  loss_dfl: 92.9081
03/12 18:02:27 - mmengine - INFO - Epoch(train)   [1][ 1400/19019]  base_lr: 2.0000e-03 lr: 4.9039e-05  eta: 5 days, 8:47:36  time: 0.2322  data_time: 0.0037  memory: 6466  grad_norm: 306.2527  loss: 311.0356  loss_cls: 129.8536  loss_bbox: 89.9395  loss_dfl: 91.2425
03/12 18:02:39 - mmengine - INFO - Epoch(train)   [1][ 1450/19019]  base_lr: 2.0000e-03 lr: 5.0791e-05  eta: 5 days, 8:54:51  time: 0.2508  data_time: 0.0039  memory: 6013  grad_norm: 296.1577  loss: 307.4349  loss_cls: 129.2732  loss_bbox: 88.2924  loss_dfl: 89.8693
03/12 18:02:51 - mmengine - INFO - Epoch(train)   [1][ 1500/19019]  base_lr: 2.0000e-03 lr: 5.2544e-05  eta: 5 days, 8:44:32  time: 0.2346  data_time: 0.0041  memory: 5746  grad_norm: 294.3796  loss: 302.3804  loss_cls: 126.6819  loss_bbox: 87.4314  loss_dfl: 88.2670
03/12 18:03:03 - mmengine - INFO - Epoch(train)   [1][ 1550/19019]  base_lr: 2.0000e-03 lr: 5.4297e-05  eta: 5 days, 8:36:30  time: 0.2362  data_time: 0.0038  memory: 5826  grad_norm: 284.2909  loss: 300.7494  loss_cls: 126.4133  loss_bbox: 87.3834  loss_dfl: 86.9527
03/12 18:03:15 - mmengine - INFO - Epoch(train)   [1][ 1600/19019]  base_lr: 2.0000e-03 lr: 5.6049e-05  eta: 5 days, 8:41:01  time: 0.2484  data_time: 0.0039  memory: 7093  grad_norm: 286.5743  loss: 296.5825  loss_cls: 125.3398  loss_bbox: 85.3759  loss_dfl: 85.8668
03/12 18:03:27 - mmengine - INFO - Epoch(train)   [1][ 1650/19019]  base_lr: 2.0000e-03 lr: 5.7802e-05  eta: 5 days, 8:35:23  time: 0.2381  data_time: 0.0040  memory: 5573  grad_norm: 272.7181  loss: 293.8952  loss_cls: 125.2692  loss_bbox: 83.8527  loss_dfl: 84.7733
03/12 18:03:39 - mmengine - INFO - Epoch(train)   [1][ 1700/19019]  base_lr: 2.0000e-03 lr: 5.9554e-05  eta: 5 days, 8:27:36  time: 0.2355  data_time: 0.0038  memory: 6200  grad_norm: 260.2163  loss: 292.4793  loss_cls: 124.4377  loss_bbox: 84.3622  loss_dfl: 83.6794
03/12 18:03:51 - mmengine - INFO - Epoch(train)   [1][ 1750/19019]  base_lr: 2.0000e-03 lr: 6.1307e-05  eta: 5 days, 8:21:50  time: 0.2372  data_time: 0.0040  memory: 7253  grad_norm: 272.3926  loss: 289.9523  loss_cls: 123.5043  loss_bbox: 83.4889  loss_dfl: 82.9591
03/12 18:04:03 - mmengine - INFO - Epoch(train)   [1][ 1800/19019]  base_lr: 2.0000e-03 lr: 6.3060e-05  eta: 5 days, 8:24:34  time: 0.2465  data_time: 0.0037  memory: 6826  grad_norm: 268.1772  loss: 285.6544  loss_cls: 120.5544  loss_bbox: 82.8305  loss_dfl: 82.2695
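One plausible source of the different trend is the much smaller effective batch (2×16 = 32 vs. the default 32×16 = 512) with the base LR left at 2e-3. Under the common linear scaling rule (an assumption on my part; the repo may not prescribe it), the LR would shrink proportionally:

```python
def linear_scaled_lr(base_lr, base_total_bs, new_total_bs):
    """Linear LR scaling rule: LR proportional to the total batch size."""
    return base_lr * new_total_bs / base_total_bs


# Default setting: 32 GPUs x 16 images/GPU with base_lr = 2e-3.
# Two 4090s x 16 images/GPU give a 16x smaller effective batch.
print(linear_scaled_lr(2e-3, 32 * 16, 2 * 16))  # prints 0.000125
```

With the LR unscaled, the nan/inf grad_norm spikes in the first iterations would be less surprising, though the warmup may still recover.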

JiayuanWang-JW avatar Mar 12 '24 22:03 JiayuanWang-JW

03/12 18:03:27 - mmengine - INFO - Epoch(train)   [1][ 1650/19019]  base_lr: 2.0000e-03 lr: 5.7802e-05  eta: 5 days, 8:35:23  time: 0.2381  data_time: 0.0040  memory: 5573  grad_norm: 272.7181  loss: 293.8952  loss_cls: 125.2692  loss_bbox: 83.8527  loss_dfl: 84.7733
03/12 18:03:39 - mmengine - INFO - Epoch(train)   [1][ 1700/19019]  base_lr: 2.0000e-03 lr: 5.9554e-05  eta: 5 days, 8:27:36  time: 0.2355  data_time: 0.0038  memory: 6200  grad_norm: 260.2163  loss: 292.4793  loss_cls: 124.4377  loss_bbox: 84.3622  loss_dfl: 83.6794
03/12 18:03:51 - mmengine - INFO - Epoch(train)   [1][ 1750/19019]  base_lr: 2.0000e-03 lr: 6.1307e-05  eta: 5 days, 8:21:50  time: 0.2372  data_time: 0.0040  memory: 7253  grad_norm: 272.3926  loss: 289.9523  loss_cls: 123.5043  loss_bbox: 83.4889  loss_dfl: 82.9591
03/12 18:04:03 - mmengine - INFO - Epoch(train)   [1][ 1800/19019]  base_lr: 2.0000e-03 lr: 6.3060e-05  eta: 5 days, 8:24:34  time: 0.2465  data_time: 0.0037  memory: 6826  grad_norm: 268.1772  loss: 285.6544  loss_cls: 120.5544  loss_bbox: 82.8305  loss_dfl: 82.2695

Your total batch size changed, so the learning rate needs to change with it.

taofuyu avatar Mar 13 '24 01:03 taofuyu

Thank you for open-sourcing such excellent work. I am reproducing YOLO-World-s on Objects365v1, but the early-stage loss does not follow the same trend as yours — is this normal? I don't have V100 cards, so I'm training on two 4090s. The batch size per card is also 16, and I didn't change any other parameters except restricting the training data to Objects365v1 only and setting the GPU count to 2. My Objects365v1 dataset was downloaded from the link you provided. Thanks.

03/12 17:57:03 - mmengine - INFO - Epoch(train)   [1][   50/19019]  base_lr: 2.0000e-03 lr: 1.7176e-06  eta: 7 days, 21:30:37  time: 0.3587  data_time: 0.0372  memory: 9180  grad_norm: nan  loss: 446.1266  loss_cls: 185.6677  loss_bbox: 123.7541  loss_dfl: 136.7047
03/12 17:57:15 - mmengine - INFO - Epoch(train)   [1][  100/19019]  base_lr: 2.0000e-03 lr: 3.4702e-06  eta: 6 days, 14:30:04  time: 0.2413  data_time: 0.0039  memory: 7734  grad_norm: 565.1302  loss: 445.8083  loss_cls: 184.7159  loss_bbox: 124.9006  loss_dfl: 136.1918
03/12 17:57:28 - mmengine - INFO - Epoch(train)   [1][  150/19019]  base_lr: 2.0000e-03 lr: 5.2228e-06  eta: 6 days, 5:49:49  time: 0.2508  data_time: 0.0038  memory: 6040  grad_norm: 538.9717  loss: 444.6346  loss_cls: 184.0897  loss_bbox: 124.3050  loss_dfl: 136.2399
03/12 17:57:39 - mmengine - INFO - Epoch(train)   [1][  200/19019]  base_lr: 2.0000e-03 lr: 6.9755e-06  eta: 5 days, 23:20:15  time: 0.2345  data_time: 0.0040  memory: 5533  grad_norm: inf  loss: 441.9562  loss_cls: 182.6187  loss_bbox: 123.2948  loss_dfl: 136.0427
03/12 17:57:51 - mmengine - INFO - Epoch(train)   [1][  250/19019]  base_lr: 2.0000e-03 lr: 8.7281e-06  eta: 5 days, 19:42:03  time: 0.2370  data_time: 0.0038  memory: 7546  grad_norm: 506.1350  loss: 438.6803  loss_cls: 180.0693  loss_bbox: 123.0374  loss_dfl: 135.5736
03/12 17:58:04 - mmengine - INFO - Epoch(train)   [1][  300/19019]  base_lr: 2.0000e-03 lr: 1.0481e-05  eta: 5 days, 18:12:13  time: 0.2475  data_time: 0.0039  memory: 5693  grad_norm: 466.7960  loss: 435.8879  loss_cls: 177.5164  loss_bbox: 123.7570  loss_dfl: 134.6145
03/12 17:58:15 - mmengine - INFO - Epoch(train)   [1][  350/19019]  base_lr: 2.0000e-03 lr: 1.2233e-05  eta: 5 days, 16:09:50  time: 0.2347  data_time: 0.0039  memory: 6253  grad_norm: 450.4666  loss: 429.7469  loss_cls: 173.1966  loss_bbox: 123.0853  loss_dfl: 133.4649
03/12 17:58:27 - mmengine - INFO - Epoch(train)   [1][  400/19019]  base_lr: 2.0000e-03 lr: 1.3986e-05  eta: 5 days, 14:47:06  time: 0.2370  data_time: 0.0037  memory: 6053  grad_norm: 460.2511  loss: 422.2496  loss_cls: 168.1797  loss_bbox: 121.7420  loss_dfl: 132.3279
03/12 17:58:39 - mmengine - INFO - Epoch(train)   [1][  450/19019]  base_lr: 2.0000e-03 lr: 1.5739e-05  eta: 5 days, 14:15:45  time: 0.2463  data_time: 0.0040  memory: 6173  grad_norm: 526.7233  loss: 415.7690  loss_cls: 164.0228  loss_bbox: 120.9978  loss_dfl: 130.7484
03/12 17:58:51 - mmengine - INFO - Epoch(train)   [1][  500/19019]  base_lr: 2.0000e-03 lr: 1.7491e-05  eta: 5 days, 13:11:15  time: 0.2339  data_time: 0.0037  memory: 8267  grad_norm: 540.6158  loss: 411.8747  loss_cls: 161.5670  loss_bbox: 120.9805  loss_dfl: 129.3272
03/12 17:59:03 - mmengine - INFO - Epoch(train)   [1][  550/19019]  base_lr: 2.0000e-03 lr: 1.9244e-05  eta: 5 days, 12:17:28  time: 0.2336  data_time: 0.0040  memory: 5800  grad_norm: 532.7258  loss: 404.0686  loss_cls: 158.0140  loss_bbox: 118.3286  loss_dfl: 127.7259
03/12 17:59:15 - mmengine - INFO - Epoch(train)   [1][  600/19019]  base_lr: 2.0000e-03 lr: 2.0997e-05  eta: 5 days, 11:37:55  time: 0.2356  data_time: 0.0039  memory: 6546  grad_norm: 513.6272  loss: 400.2819  loss_cls: 156.8279  loss_bbox: 117.0981  loss_dfl: 126.3559
03/12 17:59:27 - mmengine - INFO - Epoch(train)   [1][  650/19019]  base_lr: 2.0000e-03 lr: 2.2749e-05  eta: 5 days, 11:43:15  time: 0.2515  data_time: 0.0039  memory: 9239  grad_norm: 511.9093  loss: 394.8929  loss_cls: 152.5437  loss_bbox: 117.8386  loss_dfl: 124.5105
03/12 17:59:39 - mmengine - INFO - Epoch(train)   [1][  700/19019]  base_lr: 2.0000e-03 lr: 2.4502e-05  eta: 5 days, 11:16:46  time: 0.2378  data_time: 0.0040  memory: 5706  grad_norm: 513.3235  loss: 388.5590  loss_cls: 151.2647  loss_bbox: 115.0196  loss_dfl: 122.2748
03/12 17:59:51 - mmengine - INFO - Epoch(train)   [1][  750/19019]  base_lr: 2.0000e-03 lr: 2.6254e-05  eta: 5 days, 10:48:51  time: 0.2355  data_time: 0.0041  memory: 7240  grad_norm: 514.5650  loss: 383.7197  loss_cls: 150.8671  loss_bbox: 112.1987  loss_dfl: 120.6540
03/12 18:00:03 - mmengine - INFO - Epoch(train)   [1][  800/19019]  base_lr: 2.0000e-03 lr: 2.8007e-05  eta: 5 days, 10:51:38  time: 0.2492  data_time: 0.0040  memory: 6333  grad_norm: 508.5694  loss: 376.1887  loss_cls: 148.2612  loss_bbox: 109.8921  loss_dfl: 118.0354
03/12 18:00:15 - mmengine - INFO - Epoch(train)   [1][  850/19019]  base_lr: 2.0000e-03 lr: 2.9760e-05  eta: 5 days, 10:38:19  time: 0.2408  data_time: 0.0041  memory: 5920  grad_norm: 480.6808  loss: 369.1250  loss_cls: 145.7235  loss_bbox: 107.7232  loss_dfl: 115.6783
03/12 18:00:27 - mmengine - INFO - Epoch(train)   [1][  900/19019]  base_lr: 2.0000e-03 lr: 3.1512e-05  eta: 5 days, 10:20:42  time: 0.2375  data_time: 0.0038  memory: 5946  grad_norm: 458.0789  loss: 358.9182  loss_cls: 141.4935  loss_bbox: 105.1733  loss_dfl: 112.2514
03/12 18:00:40 - mmengine - INFO - Epoch(train)   [1][  950/19019]  base_lr: 2.0000e-03 lr: 3.3265e-05  eta: 5 days, 10:23:21  time: 0.2485  data_time: 0.0039  memory: 5800  grad_norm: 462.1731  loss: 353.5276  loss_cls: 141.4663  loss_bbox: 102.6783  loss_dfl: 109.3830
03/12 18:00:51 - mmengine - INFO - Exp name: yolo_world_v2_s_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_train_lvis_minival_20240312_175448
03/12 18:00:51 - mmengine - INFO - Epoch(train)   [1][ 1000/19019]  base_lr: 2.0000e-03 lr: 3.5018e-05  eta: 5 days, 10:03:59  time: 0.2348  data_time: 0.0038  memory: 6706  grad_norm: 437.9579  loss: 348.6449  loss_cls: 140.3549  loss_bbox: 101.0544  loss_dfl: 107.2357
03/12 18:01:03 - mmengine - INFO - Epoch(train)   [1][ 1050/19019]  base_lr: 2.0000e-03 lr: 3.6770e-05  eta: 5 days, 9:47:01  time: 0.2352  data_time: 0.0038  memory: 6786  grad_norm: 405.5232  loss: 341.1842  loss_cls: 137.7237  loss_bbox: 99.0177  loss_dfl: 104.4427
03/12 18:01:15 - mmengine - INFO - Epoch(train)   [1][ 1100/19019]  base_lr: 2.0000e-03 lr: 3.8523e-05  eta: 5 days, 9:27:24  time: 0.2323  data_time: 0.0038  memory: 5586  grad_norm: 364.3518  loss: 335.0305  loss_cls: 136.1223  loss_bbox: 96.9399  loss_dfl: 101.9682
03/12 18:01:27 - mmengine - INFO - Epoch(train)   [1][ 1150/19019]  base_lr: 2.0000e-03 lr: 4.0276e-05  eta: 5 days, 9:29:59  time: 0.2472  data_time: 0.0039  memory: 6013  grad_norm: 358.9744  loss: 330.3108  loss_cls: 135.4357  loss_bbox: 95.4105  loss_dfl: 99.4646
03/12 18:01:39 - mmengine - INFO - Epoch(train)   [1][ 1200/19019]  base_lr: 2.0000e-03 lr: 4.2028e-05  eta: 5 days, 9:26:46  time: 0.2430  data_time: 0.0038  memory: 11920  grad_norm: 344.4472  loss: 327.1931  loss_cls: 134.6496  loss_bbox: 94.8774  loss_dfl: 97.6660
03/12 18:01:51 - mmengine - INFO - Epoch(train)   [1][ 1250/19019]  base_lr: 2.0000e-03 lr: 4.3781e-05  eta: 5 days, 9:11:37  time: 0.2334  data_time: 0.0037  memory: 5560  grad_norm: 337.1382  loss: 321.7565  loss_cls: 132.5439  loss_bbox: 92.8661  loss_dfl: 96.3464
03/12 18:02:03 - mmengine - INFO - Epoch(train)   [1][ 1300/19019]  base_lr: 2.0000e-03 lr: 4.5533e-05  eta: 5 days, 9:14:24  time: 0.2472  data_time: 0.0037  memory: 5973  grad_norm: 323.9022  loss: 318.5945  loss_cls: 131.9618  loss_bbox: 91.9790  loss_dfl: 94.6536
03/12 18:02:15 - mmengine - INFO - Epoch(train)   [1][ 1350/19019]  base_lr: 2.0000e-03 lr: 4.7286e-05  eta: 5 days, 9:01:40  time: 0.2341  data_time: 0.0039  memory: 6306  grad_norm: 319.6556  loss: 314.5906  loss_cls: 131.8039  loss_bbox: 89.8786  loss_dfl: 92.9081
03/12 18:02:27 - mmengine - INFO - Epoch(train)   [1][ 1400/19019]  base_lr: 2.0000e-03 lr: 4.9039e-05  eta: 5 days, 8:47:36  time: 0.2322  data_time: 0.0037  memory: 6466  grad_norm: 306.2527  loss: 311.0356  loss_cls: 129.8536  loss_bbox: 89.9395  loss_dfl: 91.2425
03/12 18:02:39 - mmengine - INFO - Epoch(train)   [1][ 1450/19019]  base_lr: 2.0000e-03 lr: 5.0791e-05  eta: 5 days, 8:54:51  time: 0.2508  data_time: 0.0039  memory: 6013  grad_norm: 296.1577  loss: 307.4349  loss_cls: 129.2732  loss_bbox: 88.2924  loss_dfl: 89.8693
03/12 18:02:51 - mmengine - INFO - Epoch(train)   [1][ 1500/19019]  base_lr: 2.0000e-03 lr: 5.2544e-05  eta: 5 days, 8:44:32  time: 0.2346  data_time: 0.0041  memory: 5746  grad_norm: 294.3796  loss: 302.3804  loss_cls: 126.6819  loss_bbox: 87.4314  loss_dfl: 88.2670
03/12 18:03:03 - mmengine - INFO - Epoch(train)   [1][ 1550/19019]  base_lr: 2.0000e-03 lr: 5.4297e-05  eta: 5 days, 8:36:30  time: 0.2362  data_time: 0.0038  memory: 5826  grad_norm: 284.2909  loss: 300.7494  loss_cls: 126.4133  loss_bbox: 87.3834  loss_dfl: 86.9527
03/12 18:03:15 - mmengine - INFO - Epoch(train)   [1][ 1600/19019]  base_lr: 2.0000e-03 lr: 5.6049e-05  eta: 5 days, 8:41:01  time: 0.2484  data_time: 0.0039  memory: 7093  grad_norm: 286.5743  loss: 296.5825  loss_cls: 125.3398  loss_bbox: 85.3759  loss_dfl: 85.8668
03/12 18:03:27 - mmengine - INFO - Epoch(train)   [1][ 1650/19019]  base_lr: 2.0000e-03 lr: 5.7802e-05  eta: 5 days, 8:35:23  time: 0.2381  data_time: 0.0040  memory: 5573  grad_norm: 272.7181  loss: 293.8952  loss_cls: 125.2692  loss_bbox: 83.8527  loss_dfl: 84.7733
03/12 18:03:39 - mmengine - INFO - Epoch(train)   [1][ 1700/19019]  base_lr: 2.0000e-03 lr: 5.9554e-05  eta: 5 days, 8:27:36  time: 0.2355  data_time: 0.0038  memory: 6200  grad_norm: 260.2163  loss: 292.4793  loss_cls: 124.4377  loss_bbox: 84.3622  loss_dfl: 83.6794
03/12 18:03:51 - mmengine - INFO - Epoch(train)   [1][ 1750/19019]  base_lr: 2.0000e-03 lr: 6.1307e-05  eta: 5 days, 8:21:50  time: 0.2372  data_time: 0.0040  memory: 7253  grad_norm: 272.3926  loss: 289.9523  loss_cls: 123.5043  loss_bbox: 83.4889  loss_dfl: 82.9591
03/12 18:04:03 - mmengine - INFO - Epoch(train)   [1][ 1800/19019]  base_lr: 2.0000e-03 lr: 6.3060e-05  eta: 5 days, 8:24:34  time: 0.2465  data_time: 0.0037  memory: 6826  grad_norm: 268.1772  loss: 285.6544  loss_cls: 120.5544  loss_bbox: 82.8305  loss_dfl: 82.2695

Your total batch size changed, so the learning rate needs to change with it.

Thanks, I'll try scaling the learning rate proportionally tomorrow. Are there any other parameters that need to be adjusted along with it?

JiayuanWang-JW avatar Mar 13 '24 01:03 JiayuanWang-JW

To be honest, you're only using O365 without the other two datasets, and your total batch size is only 32, far from the 512 in the paper. I think it will be very hard to reproduce the results.

taofuyu avatar Mar 13 '24 02:03 taofuyu

感谢您开源这么出色的工作。我用YOLO-World-s在objects365v1复现,但是前期的loss跟您的趋势不太一样啊,请问这是正常的吗?我没有V100的卡,我只在两张4090上训练。BS per card 也是16,我没改动其他参数,除了训练数据集改成只用objects365v1和GPU数量为2,我的objects365v1数据集是按照您链接中下载的。谢谢

03/12 17:57:03 - mmengine - INFO - Epoch(train)   [1][   50/19019]  base_lr: 2.0000e-03 lr: 1.7176e-06  eta: 7 days, 21:30:37  time: 0.3587  data_time: 0.0372  memory: 9180  grad_norm: nan  loss: 446.1266  loss_cls: 185.6677  loss_bbox: 123.7541  loss_dfl: 136.7047
03/12 17:57:15 - mmengine - INFO - Epoch(train)   [1][  100/19019]  base_lr: 2.0000e-03 lr: 3.4702e-06  eta: 6 days, 14:30:04  time: 0.2413  data_time: 0.0039  memory: 7734  grad_norm: 565.1302  loss: 445.8083  loss_cls: 184.7159  loss_bbox: 124.9006  loss_dfl: 136.1918
03/12 17:57:28 - mmengine - INFO - Epoch(train)   [1][  150/19019]  base_lr: 2.0000e-03 lr: 5.2228e-06  eta: 6 days, 5:49:49  time: 0.2508  data_time: 0.0038  memory: 6040  grad_norm: 538.9717  loss: 444.6346  loss_cls: 184.0897  loss_bbox: 124.3050  loss_dfl: 136.2399
03/12 17:57:39 - mmengine - INFO - Epoch(train)   [1][  200/19019]  base_lr: 2.0000e-03 lr: 6.9755e-06  eta: 5 days, 23:20:15  time: 0.2345  data_time: 0.0040  memory: 5533  grad_norm: inf  loss: 441.9562  loss_cls: 182.6187  loss_bbox: 123.2948  loss_dfl: 136.0427
03/12 17:57:51 - mmengine - INFO - Epoch(train)   [1][  250/19019]  base_lr: 2.0000e-03 lr: 8.7281e-06  eta: 5 days, 19:42:03  time: 0.2370  data_time: 0.0038  memory: 7546  grad_norm: 506.1350  loss: 438.6803  loss_cls: 180.0693  loss_bbox: 123.0374  loss_dfl: 135.5736
03/12 17:58:04 - mmengine - INFO - Epoch(train)   [1][  300/19019]  base_lr: 2.0000e-03 lr: 1.0481e-05  eta: 5 days, 18:12:13  time: 0.2475  data_time: 0.0039  memory: 5693  grad_norm: 466.7960  loss: 435.8879  loss_cls: 177.5164  loss_bbox: 123.7570  loss_dfl: 134.6145
03/12 17:58:15 - mmengine - INFO - Epoch(train)   [1][  350/19019]  base_lr: 2.0000e-03 lr: 1.2233e-05  eta: 5 days, 16:09:50  time: 0.2347  data_time: 0.0039  memory: 6253  grad_norm: 450.4666  loss: 429.7469  loss_cls: 173.1966  loss_bbox: 123.0853  loss_dfl: 133.4649
03/12 17:58:27 - mmengine - INFO - Epoch(train)   [1][  400/19019]  base_lr: 2.0000e-03 lr: 1.3986e-05  eta: 5 days, 14:47:06  time: 0.2370  data_time: 0.0037  memory: 6053  grad_norm: 460.2511  loss: 422.2496  loss_cls: 168.1797  loss_bbox: 121.7420  loss_dfl: 132.3279
03/12 17:58:39 - mmengine - INFO - Epoch(train)   [1][  450/19019]  base_lr: 2.0000e-03 lr: 1.5739e-05  eta: 5 days, 14:15:45  time: 0.2463  data_time: 0.0040  memory: 6173  grad_norm: 526.7233  loss: 415.7690  loss_cls: 164.0228  loss_bbox: 120.9978  loss_dfl: 130.7484
03/12 17:58:51 - mmengine - INFO - Epoch(train)   [1][  500/19019]  base_lr: 2.0000e-03 lr: 1.7491e-05  eta: 5 days, 13:11:15  time: 0.2339  data_time: 0.0037  memory: 8267  grad_norm: 540.6158  loss: 411.8747  loss_cls: 161.5670  loss_bbox: 120.9805  loss_dfl: 129.3272
03/12 17:59:03 - mmengine - INFO - Epoch(train)   [1][  550/19019]  base_lr: 2.0000e-03 lr: 1.9244e-05  eta: 5 days, 12:17:28  time: 0.2336  data_time: 0.0040  memory: 5800  grad_norm: 532.7258  loss: 404.0686  loss_cls: 158.0140  loss_bbox: 118.3286  loss_dfl: 127.7259
03/12 17:59:15 - mmengine - INFO - Epoch(train)   [1][  600/19019]  base_lr: 2.0000e-03 lr: 2.0997e-05  eta: 5 days, 11:37:55  time: 0.2356  data_time: 0.0039  memory: 6546  grad_norm: 513.6272  loss: 400.2819  loss_cls: 156.8279  loss_bbox: 117.0981  loss_dfl: 126.3559
03/12 17:59:27 - mmengine - INFO - Epoch(train)   [1][  650/19019]  base_lr: 2.0000e-03 lr: 2.2749e-05  eta: 5 days, 11:43:15  time: 0.2515  data_time: 0.0039  memory: 9239  grad_norm: 511.9093  loss: 394.8929  loss_cls: 152.5437  loss_bbox: 117.8386  loss_dfl: 124.5105
03/12 17:59:39 - mmengine - INFO - Epoch(train)   [1][  700/19019]  base_lr: 2.0000e-03 lr: 2.4502e-05  eta: 5 days, 11:16:46  time: 0.2378  data_time: 0.0040  memory: 5706  grad_norm: 513.3235  loss: 388.5590  loss_cls: 151.2647  loss_bbox: 115.0196  loss_dfl: 122.2748
03/12 17:59:51 - mmengine - INFO - Epoch(train)   [1][  750/19019]  base_lr: 2.0000e-03 lr: 2.6254e-05  eta: 5 days, 10:48:51  time: 0.2355  data_time: 0.0041  memory: 7240  grad_norm: 514.5650  loss: 383.7197  loss_cls: 150.8671  loss_bbox: 112.1987  loss_dfl: 120.6540
03/12 18:00:03 - mmengine - INFO - Epoch(train)   [1][  800/19019]  base_lr: 2.0000e-03 lr: 2.8007e-05  eta: 5 days, 10:51:38  time: 0.2492  data_time: 0.0040  memory: 6333  grad_norm: 508.5694  loss: 376.1887  loss_cls: 148.2612  loss_bbox: 109.8921  loss_dfl: 118.0354
03/12 18:00:15 - mmengine - INFO - Epoch(train)   [1][  850/19019]  base_lr: 2.0000e-03 lr: 2.9760e-05  eta: 5 days, 10:38:19  time: 0.2408  data_time: 0.0041  memory: 5920  grad_norm: 480.6808  loss: 369.1250  loss_cls: 145.7235  loss_bbox: 107.7232  loss_dfl: 115.6783
03/12 18:00:27 - mmengine - INFO - Epoch(train)   [1][  900/19019]  base_lr: 2.0000e-03 lr: 3.1512e-05  eta: 5 days, 10:20:42  time: 0.2375  data_time: 0.0038  memory: 5946  grad_norm: 458.0789  loss: 358.9182  loss_cls: 141.4935  loss_bbox: 105.1733  loss_dfl: 112.2514
03/12 18:00:40 - mmengine - INFO - Epoch(train)   [1][  950/19019]  base_lr: 2.0000e-03 lr: 3.3265e-05  eta: 5 days, 10:23:21  time: 0.2485  data_time: 0.0039  memory: 5800  grad_norm: 462.1731  loss: 353.5276  loss_cls: 141.4663  loss_bbox: 102.6783  loss_dfl: 109.3830
03/12 18:00:51 - mmengine - INFO - Exp name: yolo_world_v2_s_vlpan_bn_2e-3_100e_4x8gpus_obj365v1_train_lvis_minival_20240312_175448
03/12 18:00:51 - mmengine - INFO - Epoch(train)   [1][ 1000/19019]  base_lr: 2.0000e-03 lr: 3.5018e-05  eta: 5 days, 10:03:59  time: 0.2348  data_time: 0.0038  memory: 6706  grad_norm: 437.9579  loss: 348.6449  loss_cls: 140.3549  loss_bbox: 101.0544  loss_dfl: 107.2357
03/12 18:01:03 - mmengine - INFO - Epoch(train)   [1][ 1050/19019]  base_lr: 2.0000e-03 lr: 3.6770e-05  eta: 5 days, 9:47:01  time: 0.2352  data_time: 0.0038  memory: 6786  grad_norm: 405.5232  loss: 341.1842  loss_cls: 137.7237  loss_bbox: 99.0177  loss_dfl: 104.4427
03/12 18:01:15 - mmengine - INFO - Epoch(train)   [1][ 1100/19019]  base_lr: 2.0000e-03 lr: 3.8523e-05  eta: 5 days, 9:27:24  time: 0.2323  data_time: 0.0038  memory: 5586  grad_norm: 364.3518  loss: 335.0305  loss_cls: 136.1223  loss_bbox: 96.9399  loss_dfl: 101.9682
03/12 18:01:27 - mmengine - INFO - Epoch(train)   [1][ 1150/19019]  base_lr: 2.0000e-03 lr: 4.0276e-05  eta: 5 days, 9:29:59  time: 0.2472  data_time: 0.0039  memory: 6013  grad_norm: 358.9744  loss: 330.3108  loss_cls: 135.4357  loss_bbox: 95.4105  loss_dfl: 99.4646
03/12 18:01:39 - mmengine - INFO - Epoch(train)   [1][ 1200/19019]  base_lr: 2.0000e-03 lr: 4.2028e-05  eta: 5 days, 9:26:46  time: 0.2430  data_time: 0.0038  memory: 11920  grad_norm: 344.4472  loss: 327.1931  loss_cls: 134.6496  loss_bbox: 94.8774  loss_dfl: 97.6660
03/12 18:01:51 - mmengine - INFO - Epoch(train)   [1][ 1250/19019]  base_lr: 2.0000e-03 lr: 4.3781e-05  eta: 5 days, 9:11:37  time: 0.2334  data_time: 0.0037  memory: 5560  grad_norm: 337.1382  loss: 321.7565  loss_cls: 132.5439  loss_bbox: 92.8661  loss_dfl: 96.3464
03/12 18:02:03 - mmengine - INFO - Epoch(train)   [1][ 1300/19019]  base_lr: 2.0000e-03 lr: 4.5533e-05  eta: 5 days, 9:14:24  time: 0.2472  data_time: 0.0037  memory: 5973  grad_norm: 323.9022  loss: 318.5945  loss_cls: 131.9618  loss_bbox: 91.9790  loss_dfl: 94.6536
03/12 18:02:15 - mmengine - INFO - Epoch(train)   [1][ 1350/19019]  base_lr: 2.0000e-03 lr: 4.7286e-05  eta: 5 days, 9:01:40  time: 0.2341  data_time: 0.0039  memory: 6306  grad_norm: 319.6556  loss: 314.5906  loss_cls: 131.8039  loss_bbox: 89.8786  loss_dfl: 92.9081
03/12 18:02:27 - mmengine - INFO - Epoch(train)   [1][ 1400/19019]  base_lr: 2.0000e-03 lr: 4.9039e-05  eta: 5 days, 8:47:36  time: 0.2322  data_time: 0.0037  memory: 6466  grad_norm: 306.2527  loss: 311.0356  loss_cls: 129.8536  loss_bbox: 89.9395  loss_dfl: 91.2425
03/12 18:02:39 - mmengine - INFO - Epoch(train)   [1][ 1450/19019]  base_lr: 2.0000e-03 lr: 5.0791e-05  eta: 5 days, 8:54:51  time: 0.2508  data_time: 0.0039  memory: 6013  grad_norm: 296.1577  loss: 307.4349  loss_cls: 129.2732  loss_bbox: 88.2924  loss_dfl: 89.8693
03/12 18:02:51 - mmengine - INFO - Epoch(train)   [1][ 1500/19019]  base_lr: 2.0000e-03 lr: 5.2544e-05  eta: 5 days, 8:44:32  time: 0.2346  data_time: 0.0041  memory: 5746  grad_norm: 294.3796  loss: 302.3804  loss_cls: 126.6819  loss_bbox: 87.4314  loss_dfl: 88.2670
03/12 18:03:03 - mmengine - INFO - Epoch(train)   [1][ 1550/19019]  base_lr: 2.0000e-03 lr: 5.4297e-05  eta: 5 days, 8:36:30  time: 0.2362  data_time: 0.0038  memory: 5826  grad_norm: 284.2909  loss: 300.7494  loss_cls: 126.4133  loss_bbox: 87.3834  loss_dfl: 86.9527
03/12 18:03:15 - mmengine - INFO - Epoch(train)   [1][ 1600/19019]  base_lr: 2.0000e-03 lr: 5.6049e-05  eta: 5 days, 8:41:01  time: 0.2484  data_time: 0.0039  memory: 7093  grad_norm: 286.5743  loss: 296.5825  loss_cls: 125.3398  loss_bbox: 85.3759  loss_dfl: 85.8668
03/12 18:03:27 - mmengine - INFO - Epoch(train)   [1][ 1650/19019]  base_lr: 2.0000e-03 lr: 5.7802e-05  eta: 5 days, 8:35:23  time: 0.2381  data_time: 0.0040  memory: 5573  grad_norm: 272.7181  loss: 293.8952  loss_cls: 125.2692  loss_bbox: 83.8527  loss_dfl: 84.7733
03/12 18:03:39 - mmengine - INFO - Epoch(train)   [1][ 1700/19019]  base_lr: 2.0000e-03 lr: 5.9554e-05  eta: 5 days, 8:27:36  time: 0.2355  data_time: 0.0038  memory: 6200  grad_norm: 260.2163  loss: 292.4793  loss_cls: 124.4377  loss_bbox: 84.3622  loss_dfl: 83.6794
03/12 18:03:51 - mmengine - INFO - Epoch(train)   [1][ 1750/19019]  base_lr: 2.0000e-03 lr: 6.1307e-05  eta: 5 days, 8:21:50  time: 0.2372  data_time: 0.0040  memory: 7253  grad_norm: 272.3926  loss: 289.9523  loss_cls: 123.5043  loss_bbox: 83.4889  loss_dfl: 82.9591
03/12 18:04:03 - mmengine - INFO - Epoch(train)   [1][ 1800/19019]  base_lr: 2.0000e-03 lr: 6.3060e-05  eta: 5 days, 8:24:34  time: 0.2465  data_time: 0.0037  memory: 6826  grad_norm: 268.1772  loss: 285.6544  loss_cls: 120.5544  loss_bbox: 82.8305  loss_dfl: 82.2695

If the total batch size changed, the learning rate needs to change with it.

Thanks, I'll try scaling the learning rate proportionally tomorrow. Are there any other parameters that need to be adjusted along with it?

To be honest, you are only using O365 without the other two datasets, and your total batch size is 32 versus 512 in the paper. I suspect it will be hard to reproduce the results.

I'll first see how much the performance drops: Table 3 (Ablations on Pre-training Data) in the paper reports results for pre-training on O365 alone, so I can compare against that. Also, from what I've read, "when the batch size is scaled by a factor K, some suggest scaling the learning rate by sqrt(K) or by K, though keeping the same learning rate can also work." I'll try these learning-rate settings. Thanks.
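The scaling rules quoted above can be sketched as a small helper. The function name `scale_lr` and the `rule` parameter are illustrative and not part of YOLO-World's code:

```python
import math

def scale_lr(base_lr: float, base_batch: int, new_batch: int,
             rule: str = "linear") -> float:
    """Rescale a learning rate when the total batch size changes.

    K = new_batch / base_batch.
    'linear' -> lr * K, 'sqrt' -> lr * sqrt(K), 'none' -> unchanged.
    """
    k = new_batch / base_batch
    if rule == "linear":
        return base_lr * k
    if rule == "sqrt":
        return base_lr * math.sqrt(k)
    return base_lr
```

With the base lr of 2e-3 at total batch 512, linear scaling for a total batch of 32 (K = 1/16) gives 1.25e-4, while sqrt scaling gives 5e-4.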

JiayuanWang-JW avatar Mar 13 '24 02:03 JiayuanWang-JW

Hi @JiayuanWang-JW, YOLO-World multiplies the loss by a weight of batch_size * world_size, so a different batch size or GPU count shifts the loss scale. On our side, batch_size=16 and world_size=32; you can convert accordingly. Also, since the current YOLO series compensates for batch size through this loss weight, adjusting the learning rate is largely equivalent in effect (AdamW is a special case), so changing the lr makes little difference.
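Given the scaling described above, a rough way to compare loss values across setups is to divide out the batch_size × world_size factor. This `normalized_loss` helper is a hypothetical sketch for that back-of-envelope conversion, not YOLO-World code:

```python
def normalized_loss(raw_loss: float, batch_size: int, world_size: int) -> float:
    # YOLO-World reportedly multiplies the loss by batch_size * world_size,
    # so dividing by that factor puts runs with different batch/GPU
    # configurations on a comparable scale.
    return raw_loss / (batch_size * world_size)

# Official setup: batch_size=16 per GPU, world_size=32 -> factor 512.
# An 8-GPU run with batch_size=16 per GPU  -> factor 128.
official_scale = normalized_loss(300.0, 16, 32)  # 300 / 512
local_scale = normalized_loss(300.0, 16, 8)      # 300 / 128
```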

wondervictor avatar Mar 13 '24 03:03 wondervictor

> Hi @JiayuanWang-JW, YOLO-World multiplies the loss by a weight of batch_size * world_size, so a different batch size or GPU count shifts the loss scale. On our side, batch_size=16 and world_size=32; you can convert accordingly. Also, since the current YOLO series compensates for batch size through this loss weight, adjusting the learning rate is largely equivalent in effect (AdamW is a special case), so changing the lr makes little difference.

Thanks, I'll wait for this run to finish and check the results. world_size should be obtained automatically via get_dist_info(), so I can't change that. Are there any other parameters you think should be adjusted? My setup differs from yours only in the number and model of GPUs; the per-GPU batch size is also 16. I'm not sure whether this can reproduce the O365 result in your Table 3: Ablations on Pre-training Data.
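For reference, `world_size` comes from the distributed launch environment. A minimal stand-in for mmengine's `get_dist_info()` (which actually queries `torch.distributed` when a process group is initialized) might look like this — the fallback below is an illustration, not the real implementation:

```python
import os

def get_dist_info_fallback():
    """Return (rank, world_size), mimicking mmengine.dist.get_dist_info().

    The real function asks torch.distributed when a process group is
    initialized; this sketch only reads the launcher's environment
    variables and defaults to a single process.
    """
    rank = int(os.environ.get("RANK", 0))
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    return rank, world_size
```

So `world_size` is fixed by how many processes the launcher spawns; it cannot be set in the config, only through the launch command (i.e. the number of GPUs/processes).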

JiayuanWang-JW avatar Mar 13 '24 03:03 JiayuanWang-JW

Probably the only other thing to consider is weight decay: YOLO also scales the weight decay, and in our 32-GPU training the weight decay is close to 0.2.
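The YOLOv5-style scaling referred to here multiplies a base weight decay by total batch size over a nominal batch. The helper below is a sketch under that assumption — the nominal batch of 64 and base of 0.025 are illustrative values, chosen because they land a 32-GPU × 16-images run at 0.2, consistent with the figure quoted above:

```python
def scale_weight_decay(base_wd: float, batch_per_gpu: int, world_size: int,
                       nominal_total_batch: int = 64) -> float:
    # YOLOv5-style scaling: weight decay grows linearly with the total
    # batch size relative to a nominal batch (64 here, by assumption).
    total_batch = batch_per_gpu * world_size
    return base_wd * total_batch / nominal_total_batch

# Illustrative: base 0.025, 16 images/GPU on 32 GPUs
#   -> 0.025 * 512 / 64 = 0.2
```

On an 8-GPU run with the same per-GPU batch, the same rule would give a quarter of that value.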

wondervictor avatar Mar 13 '24 04:03 wondervictor