
Question on Maploss

ThisIsIsaac opened this issue on Nov 12 '19 · 2 comments

While reading your loss function, I came across a few questions:

  1. I think def single_image_loss, defined in mseloss.py, is a misnomer: the function iterates over the outputs for the entire batch of images, not just a single image. A name like def batch_cumulative_loss would be more accurate.

  2. The only thing I read in the paper about the loss is that it uses MSE, so I was expecting the loss function to look something like this:

loss_fn = torch.nn.MSELoss(reduction='none')  # element-wise MSE, no reduction

loss1 = torch.mean(loss_fn(p_gh, gh_label))    # region score loss
loss2 = torch.mean(loss_fn(p_gah, gah_label))  # affinity score loss

# multiply by the confidence mask here

return loss1/batch_size + loss2/batch_size

However, in your loss function you cap the number of negative pixel losses based on the number of positive pixels. Why do you do this? If some outside source led you to this choice, I would really appreciate it if you could share it!
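To check my understanding, here is a minimal sketch of what I think the capping does (the function name capped_pixel_loss, its arguments, and the shapes are my assumptions, not the actual code in mseloss.py):

```python
import torch

def capped_pixel_loss(pixel_loss, pos_mask, neg_ratio=3):
    # pixel_loss: (H*W,) per-pixel squared errors for one image (assumed shape)
    # pos_mask:   (H*W,) bool mask of positive pixels
    n_pos = int(pos_mask.sum())
    pos_loss = pixel_loss[pos_mask].sum()

    # cap the negatives: keep only the neg_ratio * n_pos largest ones,
    # so the huge number of easy background pixels doesn't dominate
    neg_losses = pixel_loss[~pos_mask]
    k = min(neg_ratio * n_pos, neg_losses.numel())
    neg_loss = torch.topk(neg_losses, k).values.sum()

    return (pos_loss + neg_loss) / max(n_pos + k, 1)
```

Is that roughly what single_image_loss is doing?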

ThisIsIsaac · Nov 12 '19 04:11

Question 2 is answered in this issue by the author.

As you mentioned, the positive pixels are regions where the region score GT is larger than 0.1, and the pos-neg ratio is 1:3.

  1. When the number of positive pixels is 0, you take the 500 largest negative pixel losses. Where did this 500 come from? Is it from trial and error? (I sketch my understanding of this fallback right after this list.)
  2. What does OHEM stand for? I can't seem to find it by googling.
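Here is how I currently picture that fallback, as a sketch (the identifiers and shapes are my guesses, not the repo's real code; the 0.1 threshold and 1:3 ratio are from above):

```python
import torch

def single_image_loss_sketch(pred, gt, neg_ratio=3, fallback=500):
    # pred, gt: (H, W) region score maps for one image (assumed shapes)
    pixel_loss = (pred - gt).pow(2).flatten()
    pos_mask = (gt > 0.1).flatten()        # positives: GT score above 0.1
    n_pos = int(pos_mask.sum())

    neg_losses = pixel_loss[~pos_mask]
    # 1:3 pos-neg ratio normally; 500 hardest negatives when n_pos == 0
    k = neg_ratio * n_pos if n_pos > 0 else fallback
    k = min(k, neg_losses.numel())

    return pixel_loss[pos_mask].sum() + torch.topk(neg_losses, k).values.sum()
```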

ThisIsIsaac · Nov 12 '19 04:11

@ThisIsIsaac The choice of the 500 largest negative pixels is not sensitive for performance; you can choose 10000 or 40000 and both are OK. OHEM just means the model chooses positive and negative samples at a 1:3 ratio to train. The paper is titled "Training Region-based Object Detectors with Online Hard Example Mining" (Shrivastava et al., CVPR 2016).

backtime92 · Nov 12 '19 14:11