torchbiomed
Dice loss backward function is wrong?
Hi, in the backward function of the dice loss class, is this statement wrong?

```python
grad_input = torch.cat((torch.mul(dDice, -grad_output[0]), torch.mul(dDice, grad_output[0])), 0)
```

I think it should expand dDice from one dimension to two dimensions, and then concatenate the two parts along the second dimension, as in the sketch below.
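A minimal shape sketch of the concern, assuming the forward pass flattened a two-channel softmax output to shape (N, 2); the shapes and toy values here are assumptions for illustration, not taken from the repo:

```python
import torch

# Assume backward must return a gradient matching an (N, 2) input;
# dDice is the per-voxel derivative with shape (N,).
N = 6
dDice = torch.randn(N)
grad_out = torch.ones(1)[0]  # stand-in for grad_output[0]

# Original statement: concatenating along dim 0 gives shape (2N,),
# which cannot match an (N, 2) input.
wrong = torch.cat((torch.mul(dDice, -grad_out), torch.mul(dDice, grad_out)), 0)
print(wrong.shape)  # torch.Size([12])

# Suggested shape: lift dDice to (N, 1), then concatenate along dim=1.
d = torch.unsqueeze(dDice, 1)
right = torch.cat((torch.mul(d, grad_out), torch.mul(d, -grad_out)), dim=1)
print(right.shape)  # torch.Size([6, 2])
```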
@ljpadam Hi, Adam LI. I ran into a bug here. Can you show your version of backward for the dice loss? Thank you.
The dice loss may be wrong; should the statement be changed to the following?

```python
grad_input = torch.cat((torch.mul(dDice, grad_output[0]), torch.mul(dDice, -grad_output[0])), dim=1)
```
@ljpadam @anewlearner @JDwangmo Did you guys find a proper implementation that works fine?
Found the right implementation. Just use these two lines in the backward function:

```python
dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice, 1), grad_output[0]),
                        torch.mul(torch.unsqueeze(dDice, 1), -grad_output[0])), dim=1)
```
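For context, a hedged sketch (not the repo's exact code) of how these two lines might sit inside the backward pass. The names gt, pred, intersect, union, and input[:, 1] follow the traceback quoted further down in this thread; the standalone function signature, the gt normalization, and the toy data are assumptions for illustration:

```python
import torch

def dice_backward(input, target, intersect, union, grad_output):
    gt = torch.div(target.float(), union)      # assumed normalization of the target
    IoU2 = intersect / (union * union)
    pred = torch.mul(input[:, 1], IoU2)
    dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
    d = torch.unsqueeze(dDice, 1)              # (N,) -> (N, 1)
    # Concatenate along dim=1 so grad_input matches input's (N, 2) shape.
    return torch.cat((torch.mul(d, grad_output[0]),
                      torch.mul(d, -grad_output[0])), dim=1)

# Shape check with toy data: N flattened voxels, two-channel softmax input.
N = 8
inp = torch.softmax(torch.randn(N, 2), dim=1)
tgt = torch.randint(0, 2, (N,)).float()
intersect = torch.sum(inp[:, 1] * tgt)
union = torch.sum(inp[:, 1]) + torch.sum(tgt)
print(dice_backward(inp, tgt, intersect, union, torch.ones(1)).shape)  # torch.Size([8, 2])
```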
@hamidfarhid Thanks, it works
@Victor-2015 @hfarhidzadeh Could you show your dice loss code? The code above can't run because I hit another problem:

```
/data1/xbqfile/vnet.pytorch-master/torchbiomed/loss.py in backward(self, grad_output)
     55         IoU2 = intersect/(union*union)
     56         pred = torch.mul(input[:, 1], IoU2)
---> 57         dDice = torch.add(torch.mul(gt, 2), torch.mul(pred, -4))
     58         #grad_input = torch.cat((torch.mul(dDice, -grad_output[0]),
     59         #torch.mul(dDice, grad_output[0])), 0)

RuntimeError: arguments are located on different GPUs at /pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:269
```

Do you know how to solve it?
I found the solution to my problem. I was previously using gpu_ids=[2,3]; after changing it to gpu_ids=[0,1], the bug disappeared. It may be a bug in PyTorch: the error seems to happen only when device_ids[0] is not 0. I had already changed every xx.cuda() to xx.cuda(device=gpu_ids[0]) but still got the error, so I changed gpu_ids and it worked.
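A hedged workaround sketch for this "arguments are located on different GPUs" situation: instead of editing the code to use gpu_ids=[0,1], remap the visible devices so physical GPUs 2 and 3 appear as cuda:0 and cuda:1 inside the process. The gpu_ids variable name is this thread's; the rest is an assumption, and the env var must be set before anything touches CUDA:

```python
import os

# Remap devices: physical GPUs 2 and 3 become cuda:0 and cuda:1 for this
# process. Must run before CUDA is initialized (i.e., before importing
# torch in most setups).
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"

import torch

gpu_ids = [0, 1]                   # these now refer to physical GPUs 2 and 3
torch.cuda.set_device(gpu_ids[0])  # make the first listed device current

# model = torch.nn.DataParallel(model, device_ids=gpu_ids).cuda()
```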
> `grad_input = torch.cat((torch.mul(torch.unsqueeze(dDice, 1), grad_output[0]), torch.mul(torch.unsqueeze(dDice, 1), -grad_output[0])), dim=1)`

@hfarhidzadeh Hi, why do we need to cat the two parts? Why not just return `dDice * grad_output`? Thx.
I think that, the way they implemented it, we end up with a matrix with two columns rather than a vector. This is off the top of my head; I don't remember the details. :)
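A hedged toy example of the point above: autograd requires backward to return a gradient with the same shape as the forward input, so with a two-channel input a bare (N,) vector like `dDice * grad_output` would not do; the two concatenated columns (one per channel, opposite signs) are what make the shapes match. This is a made-up Function in the modern static-method style, not the thread's code:

```python
import torch

class ToyTwoChannel(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):                         # x: (N, 2)
        ctx.save_for_backward(x)
        return (x[:, 1] - x[:, 0]).sum()         # scalar output

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        d = torch.ones(x.shape[0]).unsqueeze(1)  # (N, 1)
        # Two columns, opposite sign, matching x's (N, 2) shape; a plain
        # (N,) vector here would be rejected by autograd.
        return torch.cat((-d * grad_output, d * grad_output), dim=1)

x = torch.randn(5, 2, requires_grad=True)
ToyTwoChannel.apply(x).backward()
print(x.grad.shape)  # torch.Size([5, 2])
```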
Thanks. 👍