yolo2-pytorch CrossEntropy Loss or MSELoss in cls

CrossEntropy Loss or MSELoss in cls_loss?

Open xmfbit opened this issue 8 years ago • 2 comments

Good work. But I am confused about how the cls loss is calculated. It seems that you use MSELoss in your code. However, in darknet, the gradient computation looks like cross-entropy: see https://github.com/pjreddie/darknet/blob/master/src/region_layer.c#L130. Also, in the YOLO9000 paper, the author seems to use MSELoss, just as he did in YOLOv1.
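To make the cross-entropy observation concrete, here is a minimal sketch in plain Python. The logits and the target class are made-up values, and nothing here is taken from darknet or from this repo. It only checks numerically that a delta of the form (one_hot - prob) equals the negative gradient of softmax cross-entropy with respect to the logits.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-probability of the target class.
    return -math.log(softmax(logits)[target])

def numeric_grad(f, x, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

logits = [1.0, -0.5, 2.0]  # hypothetical class scores for one box
target = 2                 # hypothetical ground-truth class index

prob = softmax(logits)
one_hot = [1.0 if i == target else 0.0 for i in range(len(logits))]

# Delta in the (one_hot - prob) form discussed above.
delta = [t - p for t, p in zip(one_hot, prob)]

# Gradient of the cross-entropy loss with respect to the logits.
grad = numeric_grad(lambda x: cross_entropy(x, target), logits)

for d, g in zip(delta, grad):
    print(f"delta={d:+.6f}  -dCE/dx={-g:+.6f}")
```

If the two printed columns agree, then applying (one_hot - prob) directly at the logits corresponds to cross-entropy rather than to a squared error on the probabilities.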

So could you check this? Thank you.

Aug 17 '17 10:08 xmfbit

OK... I see. You use a one-hot vector as gt_classes. But that raises a new question: the gradient (gt_class - prob) should be passed directly to the output of the final conv layer (let's call it x), whereas your code applies softmax(x) first (prob_pred = F.softmax(score_pred.view(-1, score_pred.size()[-1])).view_as(score_pred)), so the autograd mechanism will backpropagate through the softmax operation. Is that right?
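The difference can be sketched in plain Python. This is an illustration under assumptions, not the repo's code: the logits are made up, and the loss is taken to be 0.5 * sum of squared errors on softmax(x) (a different scale factor would only rescale the result). Backpropagating that loss through softmax multiplies (prob - one_hot) by the softmax Jacobian, so the gradient reaching x is not the plain difference.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mse_on_softmax(logits, one_hot):
    # Assumed loss: 0.5 * sum of squared errors on the probabilities.
    p = softmax(logits)
    return 0.5 * sum((pi - yi) ** 2 for pi, yi in zip(p, one_hot))

def grad_through_softmax(logits, one_hot):
    # Chain rule: dL/dx_j = p_j * (err_j - sum_i err_i * p_i),
    # where err = prob - one_hot and dp_i/dx_j = p_i * (delta_ij - p_j).
    p = softmax(logits)
    err = [pi - yi for pi, yi in zip(p, one_hot)]
    weighted = sum(e * pi for e, pi in zip(err, p))
    return [pi * (e - weighted) for pi, e in zip(p, err)]

logits = [1.0, -0.5, 2.0]    # hypothetical class scores for one box
one_hot = [0.0, 0.0, 1.0]    # hypothetical ground-truth class

prob = softmax(logits)
direct = [p - y for p, y in zip(prob, one_hot)]    # gradient w.r.t. prob
through = grad_through_softmax(logits, one_hot)    # gradient w.r.t. x

print("prob - one_hot        :", [round(v, 4) for v in direct])
print("after softmax backprop:", [round(v, 4) for v in through])
```

Note the sign convention: (gt_class - prob) is the negative of the `direct` vector above. Whichever sign is used, the version that passes through softmax is scaled and shifted per class, so the two update rules are not equivalent.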

Aug 17 '17 10:08 xmfbit

Hello xmfbit,

Did you manage to understand the loss function? I am struggling with that as well.

Oct 17 '17 20:10 AndresPMD
