yolov2-yolov3_PyTorch about loss function

about loss function

Open Broad-sky opened this issue 3 years ago • 10 comments

"txty_loss_function = nn.BCEWithLogitsLoss(reduction='none')" doesn't seem to match the original paper. Can you explain why?

Thanks!!

Jan 04 '21 08:01 Broad-sky

@Broad-sky

Training with MSELoss on a sigmoid output can trap the model in local minima. Many other YOLO re-implementation projects use BCELoss or BCEWithLogitsLoss, as this project does.
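To make the point above concrete, here is a minimal sketch in plain Python (not code from this repository) that compares the gradient with respect to the raw logit for the two losses. The logit value `-8.0` and the target `1.0` are assumptions chosen to represent a confidently wrong prediction; the helper names are hypothetical.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_wrt_logit(z, target):
    # d/dz (sigmoid(z) - target)^2 = 2 * (p - target) * p * (1 - p)
    # The p * (1 - p) factor shrinks toward 0 when the sigmoid saturates.
    p = sigmoid(z)
    return 2.0 * (p - target) * p * (1.0 - p)


def bce_grad_wrt_logit(z, target):
    # d/dz BCE(sigmoid(z), target) = p - target
    # No p * (1 - p) factor, so the gradient stays large when the
    # prediction is badly wrong.
    return sigmoid(z) - target


# Assumed example: the network is confidently wrong (target is 1, logit is very negative).
z, target = -8.0, 1.0
mse_grad = mse_grad_wrt_logit(z, target)
bce_grad = bce_grad_wrt_logit(z, target)

print(f"MSE gradient w.r.t. logit: {mse_grad:.6f}")
print(f"BCE gradient w.r.t. logit: {bce_grad:.6f}")
```

With a saturated sigmoid, the MSE gradient nearly vanishes while the BCE gradient stays close to -1, which is one common explanation for why MSE on sigmoid outputs can stall.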

Feb 16 '21 03:02 developer0hye

Yeah, you're right. I tried both BCE and MSE for the obj loss and found that MSE always gave better results, so I chose MSE rather than BCE.

Feb 16 '21 03:02 yjh0410

@yjh0410

Hi! I recently developed and released a Python package, plotbbox, a tool for plotting pretty bounding boxes. I wanted to share it with you!

Feb 26 '21 14:02 developer0hye

Looks so pretty! Thanks a lot!

Feb 27 '21 02:02 yjh0410

@yjh0410 Thanks a lot! What object detection algorithm are you studying now? Do you have any recommendations?

Feb 27 '21 07:02 developer0hye

Sorry, I haven't studied object detection in depth for a long time; I've just been optimizing existing projects. My tutor is considering having me do research on Temporal Action Detection.

Last week, I read the OneNet paper, which removes post-processing, including NMS, to make the object detection pipeline more concise. The goal of OneNet is to make each prediction correspond to exactly one object, so NMS is no longer needed to filter bboxes. I think that ensuring one prediction per object, rather than several predictions per object (which forces us to use NMS), is a good research direction.

Feb 27 '21 07:02 yjh0410

@yjh0410 Thanks! Their idea looks so simple to implement, but the effect is great!

Feb 28 '21 03:02 developer0hye

@yjh0410 @developer0hye

cls_loss = torch.sum(cls_loss_function(pred_cls, gt_cls) * gt_mask) / batch_size

Why divide by batch_size instead of by the number of all samples?

Could you explain that? Hope to hear back. Thanks!

Jun 09 '21 07:06 Broad-sky

@Broad-sky It is common practice to normalize the loss by the batch size. I also tried dividing it by the number of all samples, but it didn't work.

Jun 10 '21 08:06 yjh0410

@Broad-sky

We only calculate the class loss for positive samples. If we divided the output of the loss function by the number of all samples, the class loss would be a very small value. Its gradient would then be close to 0 and training would fail.
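The scale difference described above can be sketched with a small plain-Python example. All numbers are assumptions for illustration only: 13 x 13 x 5 = 845 predictions per image, 3 positive samples per image, a batch of 16, and a per-sample class loss of 1.0. The names `gt_mask` and `cls_loss` mirror the line quoted earlier in the thread, but this is not the repository's code.

```python
# Assumed sizes, for illustration only.
batch_size = 16
preds_per_image = 13 * 13 * 5      # 845 predictions per image
positives_per_image = 3            # predictions matched to a ground-truth box

# One flat list per quantity, covering the whole batch.
# gt_mask is 1 for positive samples and 0 for everything else.
gt_mask = ([1.0] * positives_per_image
           + [0.0] * (preds_per_image - positives_per_image)) * batch_size
# Pretend every prediction has a raw class loss of 1.0.
cls_loss = [1.0] * (batch_size * preds_per_image)

# Only positive samples survive the mask.
masked_sum = sum(l * m for l, m in zip(cls_loss, gt_mask))

loss_by_batch = masked_sum / batch_size        # normalize by batch size
loss_by_all = masked_sum / len(cls_loss)       # normalize by all samples

print(f"divided by batch_size:  {loss_by_batch:.4f}")
print(f"divided by all samples: {loss_by_all:.6f}")
print(f"ratio: {loss_by_batch / loss_by_all:.0f}x")
```

Under these assumptions the loss, and therefore its gradient, is 845 times smaller when divided by all samples, because the masked-out negatives inflate the denominator without contributing to the sum.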

Jun 15 '21 15:06 developer0hye
