lightweight_openpose
Question about your training loss
Hello. What loss did you get when training on the 14 AI Challenger keypoints? This is my loss, and it does not look right to me:
EpochID: 50 Iter: 140 stage1_pafs_loss: 109600.7609375 stage1_heatmaps_loss: 28056.8501953125 stage2_pafs_loss: 101297.615625 stage2_heatmaps_loss: 25829.1998046875 stage3_pafs_loss: 100676.89765625 stage3_heatmaps_loss: 25646.5669921875
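I am using PyTorch. Training with your source code also gives a very large loss, above 20,000.
Mine is very small, always in the range 0.001 to 0.01. The source code uses MSE and smooth L1 loss, so the values cannot be large.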
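To see why mean-reduced MSE and smooth L1 losses stay small, here is a minimal sketch in plain Python. It assumes that predictions and targets both lie in [0, 1] (an assumption; the thread does not state the value range) and uses the standard smooth L1 definition with a threshold of 1. The function names are illustrative and are not taken from the repository.

```python
import random

def mse(pred, target):
    """Mean squared error over flat lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1: quadratic below beta, linear at or above it."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)

random.seed(0)
n = 10_000
# Assumption: heatmap-like values in [0, 1) for both prediction and target.
target = [random.random() for _ in range(n)]
pred = [random.random() for _ in range(n)]

mse_value = mse(pred, target)
smooth_l1_value = smooth_l1(pred, target)
print(mse_value, smooth_l1_value)
```

Under these assumptions each per-element term is at most 1 for MSE and at most 0.5 for smooth L1, so a mean-reduced loss in the tens of thousands would point to a different reduction or a different target scale.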
Got it. I will keep debugging on my side and take another look at the loss.
Hello, could you leave your contact details so we can discuss this?
WeChat: mahongyun177. I do not understand this very well either, so we still need to ask the author.
@murdockhou
Iter: 130000 stage1_pafs_loss: 0.013112904597073793 stage1_heatmaps_loss: 0.00752411549910903 stage2_pafs_loss: 0.011273426562547683 stage2_heatmaps_loss: 0.00575953284278512 stage3_pafs_loss: 0.011314316745847463 stage3_heatmaps_loss: 0.005732348002493381
EpochID: 50 Iter: 140000 stage1_pafs_loss: 0.013068882934749126 stage1_heatmaps_loss: 0.007446729391813278 stage2_pafs_loss: 0.011319325864315033 stage2_heatmaps_loss: 0.005765374982729554 stage3_pafs_loss: 0.011329260561615229 stage3_heatmaps_loss: 0.005744910659268498
Do you think a loss like this is acceptable?
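@amitabhama The exact loss value is not what matters. What matters is:
- How your loss is computed. The loss should fall in a rough expected range, and you need to make sure its magnitude is consistent with your loss formula.
- As training proceeds, the overall loss trends downward, and the model produces results at test time.
The specific loss value is not critical as long as these two points hold.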
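One way to check the first point is to compare the same error under different reductions. The sketch below is not from the repository; the map size (46 x 46 x 14) and the per-element error of 0.1 are hypothetical values chosen only for illustration. It relies on the fact that TensorFlow's `tf.nn.l2_loss` is defined as `sum(t ** 2) / 2`, so it sums rather than averages.

```python
# Hypothetical output size: a 46 x 46 map with 14 keypoint channels.
height, width, channels = 46, 46, 14
num_elements = height * width * channels

# Assume every element is off by the same amount (illustration only).
per_element_error = 0.1
diff = [per_element_error] * num_elements

# Mean-reduced squared error: independent of the tensor size.
mean_loss = sum(d * d for d in diff) / num_elements

# Sum-reduced, halved squared error (the tf.nn.l2_loss definition).
sum_loss = sum(d * d for d in diff) / 2

print(f"elements:  {num_elements}")
print(f"mean loss: {mean_loss:.4f}")
print(f"sum loss:  {sum_loss:.1f}")
```

A sum-reduced loss also grows with the batch size and with the number of stages added together, so values in the hundreds or thousands can be consistent with small per-element errors. Whether that explains the numbers in this thread is not confirmed here.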
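I am using l2_loss. How many epochs did you train the TF version for?
@amitabhama I trained the TF version for about 50 epochs and then stopped to work on other tasks. The model from that run worked quite well in real-world testing, but its score on the AI Challenger validation set was poor, and I did not follow up after that.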
@murdockhou INFO:tensorflow:Saving dict for global step 13125: global_step = 13125, heatmap = 0.11633993, loss = 1555.1757, paf = 0.11959251
When training with your source code, what do the heatmap and paf values mean? And for this loss, what range indicates reasonable performance? Thanks.
@amitabhama heatmap and paf are defined at this link. A smaller loss is always better, but I am not sure at what value the results become good.
@murdockhou Then may I ask roughly what your loss was during training, if you do not mind sharing? Here are my current training results:
INFO:tensorflow:Finished evaluation at 2019-12-10-02:13:41
INFO:tensorflow:Saving dict for global step 15750: global_step = 15750, heatmap = 0.12529747, loss = 1782.9702, paf = 0.13560909
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 15750: /home/majingxiang/pose_tf/lightweight_openpose/checkpoint/model.ckpt-15750
INFO:tensorflow:global_step/sec: 0.228574
INFO:tensorflow:loss = 9307.789, step = 15800 (437.473 sec)
INFO:tensorflow:global_step/sec: 0.233062
INFO:tensorflow:loss = 11428.682, step = 15900 (429.077 sec)
INFO:tensorflow:global_step/sec: 0.227805
INFO:tensorflow:loss = 10010.404, step = 16000 (438.968 sec)
INFO:tensorflow:global_step/sec: 0.229515
INFO:tensorflow:loss = 10120.916, step = 16100 (435.699 sec)
INFO:tensorflow:global_step/sec: 0.232309
INFO:tensorflow:loss = 10017.281, step = 16200 (430.470 sec)
INFO:tensorflow:global_step/sec: 0.229735
INFO:tensorflow:loss = 10713.017, step = 16300 (435.318 sec)
INFO:tensorflow:global_step/sec: 0.240688
INFO:tensorflow:loss = 11633.489, step = 16400 (415.436 sec)
INFO:tensorflow:global_step/sec: 0.238022
INFO:tensorflow:loss = 10606.381, step = 16500 (420.130 sec)
INFO:tensorflow:global_step/sec: 0.228483
INFO:tensorflow:loss = 10207.916, step = 16600 (437.666 sec)
INFO:tensorflow:global_step/sec: 0.228785
INFO:tensorflow:loss = 9776.236, step = 16700 (437.123 sec)
INFO:tensorflow:global_step/sec: 0.22944
INFO:tensorflow:loss = 10440.903, step = 16800 (435.817 sec)
INFO:tensorflow:global_step/sec: 0.233133
INFO:tensorflow:loss = 10533.533, step = 16900 (428.941 sec)
INFO:tensorflow:global_step/sec: 0.234757
INFO:tensorflow:loss = 9553.179, step = 17000 (425.971 sec)
INFO:tensorflow:global_step/sec: 0.231954
INFO:tensorflow:loss = 9749.698, step = 17100 (431.128 sec)
INFO:tensorflow:global_step/sec: 0.233889
INFO:tensorflow:loss = 9535.672, step = 17200 (427.558 sec)
INFO:tensorflow:global_step/sec: 0.234923
INFO:tensorflow:loss = 10243.893, step = 17300 (425.650 sec)
INFO:tensorflow:global_step/sec: 0.233815
INFO:tensorflow:loss = 10351.766, step = 17400 (427.698 sec)
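Per-step training loss in a log like this is noisy, so a trend is easier to judge after smoothing. The following is a minimal sketch that extracts `(step, loss)` pairs from lines in the format shown in this thread and computes a trailing moving average. The helper names and the window size are illustrative choices, not part of the repository.

```python
import re

# Matches training lines such as "INFO:tensorflow:loss = 9307.789, step = 15800".
# Evaluation lines ("Saving dict ... loss = ...") are intentionally not matched.
LOSS_RE = re.compile(r"INFO:tensorflow:loss = ([0-9.]+), step = (\d+)")

def parse_losses(log_text):
    """Return a list of (step, loss) pairs found in the log text."""
    return [(int(step), float(loss)) for loss, step in LOSS_RE.findall(log_text)]

def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# A short excerpt of the log posted in this thread.
sample_log = (
    "INFO:tensorflow:loss = 9307.789, step = 15800 (437.473 sec) "
    "INFO:tensorflow:global_step/sec: 0.233062 "
    "INFO:tensorflow:loss = 11428.682, step = 15900 (429.077 sec) "
    "INFO:tensorflow:loss = 10010.404, step = 16000 (438.968 sec) "
    "INFO:tensorflow:loss = 10120.916, step = 16100 (435.699 sec)"
)

pairs = parse_losses(sample_log)
smoothed = moving_average([loss for _, loss in pairs], window=2)
print(pairs)
print(smoothed)
```

Applied to a full log with a larger window, the smoothed series gives a clearer view of whether the overall loss is still decreasing.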