
Wrong code!

Guocode opened this issue 5 years ago · 6 comments

num = torch.sum(torch.mul(predict, target)) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth

The addition of `smooth` is done at every index! It should be like this:

self.smooth = 1
...
num = torch.sum(torch.mul(predict, target)) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth
...
return 1 - num/den

Guocode avatar Apr 30 '19 08:04 Guocode

Emm, a typo? I don't see any differences.

hubutui avatar Apr 30 '19 13:04 hubutui

@hubutui Sorry, I misquoted. This is your code:

        num = torch.sum(torch.mul(predict, target), dim=1) + self.smooth
        den = torch.sum(predict.pow(self.p) + target.pow(self.p), dim=1) + self.smooth

"+" in pytorch of a tensor and a scalar will be done on every index, so if you sum up the num, molecule is the sum of num and smooth multiply with the shape of the sum(not 1 but maybe millions).

Guocode avatar Apr 30 '19 13:04 Guocode

And the multiplication by 2 is missing:

self.smooth = 1
...
num = torch.sum(torch.mul(predict, target))*2 + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth
...
return 1 - num/den
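For reference, a self-contained sketch of the proposed fix (the class name, the flattening, and the defaults are assumptions for illustration, not code from this repo):

import torch
import torch.nn as nn

class DiceLossFixed(nn.Module):
    """Hypothetical fix: reduce first, add smooth exactly once, double the numerator."""
    def __init__(self, smooth=1, p=2):
        super().__init__()
        self.smooth = smooth
        self.p = p

    def forward(self, predict, target):
        # Flatten each sample so the sums run over all elements.
        predict = predict.contiguous().view(predict.shape[0], -1)
        target = target.contiguous().view(target.shape[0], -1)
        num = torch.sum(torch.mul(predict, target)) * 2 + self.smooth
        den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth
        return 1 - num / den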

Guocode avatar May 01 '19 07:05 Guocode

@Guocode Can you just do the following?

  1. Reference the lines in this repo's code where the calculation is potentially wrong.
  2. Explain what should happen (mathematically) and what his code is doing erroneously.
  3. Propose a new code.

All in the same comment? It is hard for me to follow and understand what you are trying to tell me.

@hubutui Can you review this issue? I was planning to use your loss on a segmentation deep learning pipeline.

asciidiego avatar Aug 06 '19 20:08 asciidiego

@diegovincent I have pointed out the difference, and I have released my code on my page; you can have a look at that.

Guocode avatar Aug 13 '19 12:08 Guocode

Let me try to work this out. This repo's code:

num = torch.sum(torch.mul(predict, target), dim=1) + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p), dim=1) + self.smooth

Guocode's proposal:

num = torch.sum(torch.mul(predict, target))*2 + self.smooth
den = torch.sum(predict.pow(self.p) + target.pow(self.p)) + self.smooth

So it adds a `*2` and removes `dim=1`.

I think Guocode is saying that if you sum over dim=1 only, leaving the other dimensions, you get a tensor of shape, say, [16, 4]. Now if you add `smooth`, PyTorch will broadcast it to shape [16, 4] and add it to each element. The result is that you've added it many times.

The alternative is to sum over all elements first and then add `smooth`. But this changes the equation: you're no longer calculating the dice loss for each sample, but for the sum of all elements across the whole batch. There may be disadvantages for backprop if you sum too early, since the gradient isn't traced through the dice loss of each sample. The only disadvantage I see in adding `smooth` to each sample is that it smooths more. But you can [actually measure] the effect smoothing has on the output, and it gives outputs similar to MSELoss. A sketch of the two variants follows below.
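To make the comparison concrete, here are the two variants side by side (the shapes and the final mean are assumptions for illustration):

import torch

predict = torch.rand(16, 4)
target = torch.rand(16, 4)
smooth, p = 1, 2

# Variant A (this repo): per-sample dice, smooth broadcast onto each sample.
num_a = torch.sum(predict * target, dim=1) + smooth                # shape [16]
den_a = torch.sum(predict.pow(p) + target.pow(p), dim=1) + smooth
loss_a = (1 - num_a / den_a).mean()  # assumed final reduction

# Variant B (Guocode's proposal): one global dice, smooth added once, numerator doubled.
num_b = torch.sum(predict * target) * 2 + smooth                   # scalar
den_b = torch.sum(predict.pow(p) + target.pow(p)) + smooth
loss_b = 1 - num_b / den_b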

Also, I implemented dice loss in the same way in this gist, and quite a few people reviewed it.

I think it's fine as it is.

wassname avatar Dec 01 '19 23:12 wassname