Focal-Loss-implement-on-Tensorflow

Question on MobileNet SSD architecture with focal loss.

Open MAGI003769 opened this issue 6 years ago • 4 comments

Hello! First of all, thanks for your implementation. You're really awesome. My question is how to merge focal loss into the SSD architecture, as I'm now working on SSD for my project.

  1. Is it correct to simply replace the original softmax loss with focal loss? Or is it necessary to apply it to the localization loss as well?
  2. Since the strength of focal loss is handling class imbalance, should I remove the hard negative mining operations mentioned in the SSD paper? What did you do in your implementation?

Thanks a lot for your brilliant work and for your patience in reading these questions. Looking forward to your reply.

MAGI003769 (Mar 22 '18 13:03)

Hi, thanks for your praise.

  1. You only need to replace the softmax loss with focal loss for classification.
  2. You can remove the hard mining operations, but I chose to change the config file to (num_hard_example=20000, max_neg_per_pos=1000, max_total_detection=20000) instead. Feel free to try either approach.

ailias (Mar 24 '18 16:03)
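
Editor's sketch: the reason hard negative mining becomes less critical with focal loss is that confidently classified background boxes contribute almost nothing to the total. The snippet below is a minimal pure-Python illustration of the sigmoid form of focal loss for a single prediction, not the code from this repository. The function name `sigmoid_focal_loss` and the example logits are assumptions made for illustration; `alpha=0.25` and `gamma=2.0` are the defaults reported in the focal loss paper.

```python
import math


def sigmoid_focal_loss(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for one binary (per-class) prediction.

    logit:  raw, pre-sigmoid score (hypothetical example input).
    target: 1 for a positive, 0 for a negative (background).
    Not numerically hardened: very large |logit| values can overflow.
    """
    p = 1.0 / (1.0 + math.exp(-logit))                  # predicted probability
    p_t = p if target == 1 else 1.0 - p                 # probability of the true label
    alpha_t = alpha if target == 1 else 1.0 - alpha     # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logit, target):
    """Plain sigmoid cross-entropy, for comparison."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p
    return -math.log(p_t)


# Made-up logits: one background box scored correctly, one scored as an object.
easy_negative = sigmoid_focal_loss(-4.0, 0)
hard_negative = sigmoid_focal_loss(2.0, 0)

print("easy negative:", easy_negative, "vs cross-entropy", cross_entropy(-4.0, 0))
print("hard negative:", hard_negative, "vs cross-entropy", cross_entropy(2.0, 0))
```

With these inputs the easy negative's focal loss is several orders of magnitude below the hard negative's, whereas plain cross-entropy separates them far less. That is why the large mining limits quoted above, which effectively keep almost every negative, can still train stably.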

Thanks for your answer.

The original softmax loss returns a tensor of shape (batch_size, num_box), i.e. one scalar value per box. In your implementation, however, focal loss returns a single scalar. Does that mean one loss value for the whole batch? Doesn't the tf.reduce_sum() call in the last line of code need to specify an axis, maybe -1? I think the loss should be computed per box rather than per batch.

Could you please look into this at your earliest convenience?

MAGI003769 (Mar 29 '18 05:03)

Oh, my god. The latest version of models has changed the return value of the loss function. My code was written for a previous version (maybe before version 1.2). You just need to return the per_entry_cross_ent variable rather than the tf.reduce_sum() result. Actually, models now implements focal loss itself as 'class SigmoidFocalClassificationLoss(Loss)'; you can give it a try.

ailias (Mar 31 '18 16:03)
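
Editor's sketch: the shape question in this exchange can be illustrated without TensorFlow. The snippet below is a pure-Python stand-in, not the repository's code. It uses nested lists shaped [batch_size][num_boxes][num_classes] in place of tensors, and the names `focal_term` and `per_box_focal_loss` and the toy values are assumptions. It contrasts summing over the class axis only (one value per box) with summing everything (one value per batch, which is what a `tf.reduce_sum()` call without an axis produces).

```python
import math


def focal_term(logit, target, alpha=0.25, gamma=2.0):
    """Sigmoid focal loss for one (box, class) entry; target is 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def per_box_focal_loss(logits, targets):
    """Sum focal terms over the class axis only.

    logits, targets: nested lists shaped [batch_size][num_boxes][num_classes].
    Returns a nested list shaped [batch_size][num_boxes].
    """
    return [
        [
            sum(focal_term(l, t) for l, t in zip(box_logits, box_targets))
            for box_logits, box_targets in zip(image_logits, image_targets)
        ]
        for image_logits, image_targets in zip(logits, targets)
    ]


# Toy batch: 2 images, 3 boxes each, 2 classes (all values are made up).
logits = [
    [[2.0, -1.0], [-3.0, -3.0], [0.5, 0.5]],
    [[-2.0, 1.5], [-4.0, -4.0], [1.0, -1.0]],
]
targets = [
    [[1, 0], [0, 0], [0, 1]],
    [[0, 1], [0, 0], [1, 0]],
]

per_box = per_box_focal_loss(logits, targets)     # one loss value per box
batch_scalar = sum(sum(row) for row in per_box)   # one loss value per batch

print("per-box shape:", (len(per_box), len(per_box[0])))
print("batch scalar:", batch_scalar)
```

Which of these shapes the surrounding training code expects depends on its version, so it is worth checking the loss base class you plug into before choosing where to reduce.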

Hi @ailias @MAGI003769, so I can directly use the TensorFlow object_detection models API to combine focal loss with SSD. However, I am not sure whether it can be used directly for a multi-label dataset. Should I make some modifications? Thanks.

AllenDun (Feb 15 '19 04:02)