DTFD-MIL
wrong gradient calculation code?
Hi @hrzhang1123,
With torch version 1.12, the code raises
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [128, 1]], which is output 0 of AsStridedBackward0, is at version 6; expected version 5 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
at
64 loss1 = ce_cri(gSlidePred, tslideLabel).mean()
65 optimizer1.zero_grad()
---> 66 loss1.backward()
67 torch.nn.utils.clip_grad_norm_(attCls.parameters(), 5)
68 optimizer1.step()
This can be resolved by calling optimizer0.step() right before optimizer1.step(). This makes sense: stepping optimizer0 before loss1.backward() updates the tier-1 weights in place, so the backward pass of loss1 would otherwise run against already-modified weights. Could you consider reviewing this?
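For reference, here is a minimal, self-contained sketch of the failure pattern and of the reordering described above. The tier1/tier2 modules, optimizer setup, and loss definitions are hypothetical stand-ins, not code from this repo:

```python
import torch

# Hypothetical two-tier setup standing in for the tier-1 / tier-2 training step.
tier1 = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 4))
tier2 = torch.nn.Linear(4, 1)
optimizer0 = torch.optim.SGD(tier1.parameters(), lr=0.1)
optimizer1 = torch.optim.SGD(tier2.parameters(), lr=0.1)
x = torch.randn(8, 4)

# Original ordering: optimizer0.step() updates tier-1 weights in place, but
# loss1.backward() still needs the saved (pre-update) tier-1 weights.
try:
    feat = tier1(x)
    loss0 = feat.sum()
    loss0.backward(retain_graph=True)
    optimizer0.step()              # in-place update bumps the saved tensors' version
    loss1 = tier2(feat).sum()
    loss1.backward()               # fails on newer PyTorch versions
except RuntimeError as err:
    print("original ordering:", err)

# Reordered version: run both backward passes first, then step both optimizers,
# so no parameter is modified while its saved value is still needed.
optimizer0.zero_grad()
optimizer1.zero_grad()
feat = tier1(x)
loss0 = feat.sum()
loss0.backward(retain_graph=True)
loss1 = tier2(feat).sum()
loss1.backward()
optimizer0.step()
optimizer1.step()
```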
Hi @Treeboy2762, I also met this error when using a newer version of PyTorch. The bug disappears if I use PyTorch 1.4.
Thank you. I also solved this problem by switching to PyTorch 1.4.
@Furyboyy Thanks! This temporarily solves the problem, but I am not sure it's an appropriate solution.
I have a solution: adjusting the position of optimizer0.step() lets me run the code.
Or using the detach() method at:
slide_pseudo_feat.append(af_inst_feat.detach())
slide_pseudo_feat.append(max_inst_feat.detach())
slide_pseudo_feat.append(MaxMin_inst_feat.detach())
also works
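For context, here is a small self-contained sketch (hypothetical stand-ins, not repo code) of why detaching avoids the error: the tier-2 loss is built only from a detached copy of the tier-1 features, so loss1.backward() never revisits tier-1's graph, and the earlier in-place optimizer0.step() update cannot invalidate anything it still needs:

```python
import torch

# Hypothetical two-tier setup; slide_pseudo_feat mirrors the list in the snippet above.
tier1 = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 4))
tier2 = torch.nn.Linear(4, 1)
optimizer0 = torch.optim.SGD(tier1.parameters(), lr=0.1)
optimizer1 = torch.optim.SGD(tier2.parameters(), lr=0.1)
x = torch.randn(8, 4)

slide_pseudo_feat = []
feat = tier1(x)
loss0 = feat.sum()
loss0.backward()
optimizer0.step()

slide_pseudo_feat.append(feat.detach())          # as in the append calls above
loss1 = tier2(torch.cat(slide_pseudo_feat, dim=0)).sum()
loss1.backward()                                 # only traverses tier2's graph
optimizer1.step()
```

Note that with the features detached, loss1 no longer contributes any gradient to the tier-1 networks, so it is worth checking whether that matches the intended training scheme.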
I haven't tested the performance of this version yet, but no bugs are reported.