
Dice loss backward function is wrong?

Open ljpadam opened this issue 8 years ago • 10 comments

Hi, in the `backward` function of the dice loss class, is this statement wrong? `grad_input = torch.cat((torch.mul(dDice, -grad_output[0]), torch.mul(dDice, grad_output[0])), 0)`

I think it should expand dDice from one dimension to two dimensions, and then concatenate the two parts along the second dimension.

ljpadam avatar Jul 15 '17 03:07 ljpadam
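A minimal, runnable sketch of the shape mismatch ljpadam describes (the names `N` and `go` are illustrative, not from the repo): the loss input has two channels, so `grad_input` must be `(N, 2)`, but concatenating the 1-D `dDice` along dim 0 yields a flat `(2N,)` tensor.

```python
import torch

N = 5
dDice = torch.randn(N)          # per-voxel dDice, one dimension
go = torch.tensor(1.0)          # stand-in for grad_output[0]

# the statement as written: cat along dim 0 of two 1-D tensors
wrong = torch.cat((torch.mul(dDice, -go), torch.mul(dDice, go)), 0)
print(wrong.shape)              # torch.Size([10]) -- flat, wrong shape

# the proposed fix: expand to two dimensions, then cat along dim 1
d2 = torch.unsqueeze(dDice, 1)  # (N, 1)
right = torch.cat((torch.mul(d2, go), torch.mul(d2, -go)), dim=1)
print(right.shape)              # torch.Size([5, 2]) -- matches the input
```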

@ljpadam Hi, Adam LI. I ran into a bug here. Can you show your version of `backward` for the dice loss? Thank you.

anewlearner avatar Oct 23 '17 03:10 anewlearner

The dice loss may be wrong; I think the statement should be changed to: `grad_input = torch.cat((torch.mul(dDice, grad_output[0]), torch.mul(dDice, -grad_output[0])), dim=1)`?

JDwangmo avatar Oct 31 '17 02:10 JDwangmo

@ljpadam @anewlearner @JDwangmo Did you guys find a proper implementation that works fine?

hfarhidzadeh avatar Jan 14 '18 23:01 hfarhidzadeh

Found the right implementation. Just use these two lines in the `backward` function:

```python
dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice, 1), grad_output[0]),
                        torch.mul(torch.unsqueeze(dDice, 1), -grad_output[0])), dim=1)
```

hfarhidzadeh avatar Jan 15 '18 17:01 hfarhidzadeh
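For context, here is a sketch of how the whole `backward` might look with this fix dropped in. It is reconstructed from the loss.py snippet quoted in the traceback below, using the old-style `torch.autograd.Function` API this repo targets; the saved-tensor bookkeeping (`self.intersect`, `self.union`) and the `gt` normalization are assumptions about what `forward` stashes, not verified against the repo.

```python
import torch
from torch.autograd import Function

class DiceLoss(Function):
    # forward (not shown) is assumed to call self.save_for_backward(input, target)
    # and to stash self.intersect and self.union, as the original loss.py appears to
    def backward(self, grad_output):
        input, target = self.saved_tensors
        intersect, union = self.intersect, self.union
        gt = torch.div(target, union)                 # assumed normalization
        IoU2 = intersect / (union * union)
        pred = torch.mul(input[:, 1], IoU2)
        dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
        # the fix: expand dDice to (N, 1) and build an (N, 2) gradient,
        # one column per output channel, with opposite signs
        d2 = torch.unsqueeze(dDice, 1)
        grad_input = torch.cat((torch.mul(d2, grad_output[0]),
                                torch.mul(d2, -grad_output[0])), dim=1)
        return grad_input, None
```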

@hamidfarhid Thanks, it works

ghost avatar Jan 16 '18 11:01 ghost

@Victor-2015 @hfarhidzadeh Could you show your dice loss code? The code above can't run for me because I hit another problem:

```
/data1/xbqfile/vnet.pytorch-master/torchbiomed/loss.py in backward(self, grad_output)
     55         IoU2 = intersect/(union*union)
     56         pred = torch.mul(input[:, 1], IoU2)
---> 57         dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
     58         #grad_input = torch.cat((torch.mul(dDice, -grad_output[0]),
     59         #torch.mul(dDice, grad_output[0])), 0)

RuntimeError: arguments are located on different GPUs at /pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:269
```

Do you know how to solve it?

xubaoquan33 avatar May 15 '18 08:05 xubaoquan33

I found the solution to my problem.
Previously I had gpu_ids=[2,3]; I changed it to gpu_ids=[0,1] and the bug disappeared. It may be a PyTorch bug: the error seems to happen only when device_ids[0] is not 0. I had already changed every xx.cuda() to xx.cuda(device=gpu_ids[0]) but still got the error, so changing gpu_ids was what worked.

xubaoquan33 avatar May 15 '18 12:05 xubaoquan33
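A way to rule out this class of error regardless of which gpu_ids are chosen is to move everything the backward touches onto one device. A minimal sketch with a hypothetical helper (`to_same_device` is my own name, not from the repo), written against the modern `.to(device)` API, which may postdate the PyTorch version used in this thread:

```python
import torch

def to_same_device(reference, *tensors):
    # Hypothetical helper: move all tensors onto the device of `reference`
    # so pointwise ops like torch.mul never mix GPUs.
    device = reference.device
    return [t.to(device) for t in tensors]

# usage sketch inside backward(), before the pointwise math:
# gt, pred = to_same_device(grad_output, gt, pred)
# dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
```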

> grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice,1), grad_output[0]), torch.mul(torch.unsqueeze(dDice,1), -grad_output[0])), dim = 1)

@hfarhidzadeh Hi, why do we need to cat the two parts? Why not just return dDice * grad_output? Thanks.

JasonLeeSJTU avatar Jun 15 '19 16:06 JasonLeeSJTU

> grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice,1), grad_output[0]), torch.mul(torch.unsqueeze(dDice,1), -grad_output[0])), dim = 1)
>
> @hfarhidzadeh Hi, why do we need to cat the two parts? Why not just return dDice * grad_output? Thanks.

I think the way they implemented it, in the end we have a matrix with two columns, not a vector. This is off the top of my head and I don't remember the details. :)

hfarhidzadeh avatar Jun 15 '19 16:06 hfarhidzadeh
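A small shape check (my own illustration, not from the thread) of why a flat `dDice * grad_output` would not do: autograd requires `grad_input` to match the shape of the loss input, which here has two channels (background, foreground), and the Dice derivative enters the two channels with opposite signs.

```python
import torch

N = 4
input = torch.randn(N, 2)            # two softmax channels per voxel
dDice = torch.randn(N)               # per-voxel derivative of the Dice score
go = torch.tensor(1.0)               # stand-in for grad_output[0]

d2 = torch.unsqueeze(dDice, 1)                       # (N, 1)
grad_input = torch.cat((d2 * go, d2 * -go), dim=1)   # (N, 2)
assert grad_input.shape == input.shape               # must match to backprop

print((dDice * go).shape)            # torch.Size([4]) -- a vector, wrong shape
```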

> > grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice,1), grad_output[0]), torch.mul(torch.unsqueeze(dDice,1), -grad_output[0])), dim = 1)
> >
> > @hfarhidzadeh Hi, why do we need to cat the two parts? Why not just return dDice * grad_output? Thanks.
>
> I think the way they implemented it, in the end we have a matrix with two columns, not a vector. This is off the top of my head and I don't remember the details. :)

Thanks. 👍

JasonLeeSJTU avatar Jun 16 '19 04:06 JasonLeeSJTU