fssd.pytorch

loss.backward()

Open DW1HH opened this issue 6 years ago • 8 comments

```
Traceback (most recent call last):
  File "train.py", line 268, in <module>
    train()
  File "train.py", line 233, in train
    loss.backward()
  File "/home/huhuai/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/huhuai/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```

DW1HH avatar Mar 29 '19 10:03 DW1HH
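For reference, this error means an in-place operation overwrote a tensor that autograd had saved for the backward pass. A minimal standalone repro (not the repo's code) that triggers the same `RuntimeError`, with the anomaly-detection switch that later PyTorch versions suggest in the error hint:

```python
import torch

# torch.exp saves its *output* for the backward pass (d/dx exp(x) = exp(x)).
# Modifying that output in place bumps its version counter, so backward()
# detects the mismatch and raises the same RuntimeError as in the traceback.
torch.autograd.set_detect_anomaly(True)  # reports which forward op is at fault

x = torch.randn(4, requires_grad=True)
y = torch.exp(x)
y += 1  # in-place edit of a tensor autograd still needs

try:
    y.sum().backward()
except RuntimeError as e:
    print("caught:", type(e).__name__)
```

With anomaly detection enabled, the warning points at the forward call (here `torch.exp`) whose saved tensor was clobbered, which is how you can locate the offending in-place op in a larger model.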

@DW1HH The error is caused by in-place operations in the model file; you need to turn off the in-place switch in models/fssd_vgg.py.

dlyldxwl avatar Apr 15 '19 01:04 dlyldxwl
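To make the suggestion above concrete: the change is to construct the activations out-of-place. A hypothetical sketch (the real layer list in models/fssd_vgg.py differs; `make_extra_block` is an illustrative name, not a function from the repo):

```python
import torch
import torch.nn as nn

def make_extra_block(in_ch, out_ch):
    """Hypothetical Conv+ReLU block; only the inplace flag matters here."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        # was: nn.ReLU(inplace=True) -- the in-place variant overwrites the
        # conv output, which autograd may have saved for the backward pass
        nn.ReLU(inplace=False),
    )

block = make_extra_block(3, 8)
out = block(torch.randn(1, 3, 9, 9))
out.sum().backward()  # backward succeeds with the out-of-place ReLU
```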

> @DW1HH The error is caused by in-place operations in the model file; you need to turn off the in-place switch in models/fssd_vgg.py.

Sorry, I also ran into this problem when training on VOC2007. Could you explain how to fix it in detail?

rw1995 avatar May 30 '19 06:05 rw1995

Like this?

```
Loading base network...
Initializing weights...
train.py:98: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.
  init.kaiming_normal(m.state_dict()[key], mode='fan_out')
Loading Dataset...
Training FSSD_VGG on VOC0712
avg_loss_list: [0.0]
/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/functional.py:2539: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/nn/_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "train.py", line 262, in <module>
    train()
  File "train.py", line 227, in train
    loss.backward()
  File "/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/rw/anaconda3/envs/pytorch0.4.0/lib/python3.5/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [16, 256, 9, 9]], which is output 0 of ReluBackward1, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```

rw1995 avatar May 30 '19 06:05 rw1995

In fssd_vgg.py I replaced

```python
for k, v in enumerate(self.extras):
    # x = F.relu(v(x), inplace=True)
    x = F.relu(v(x))
```

and it trains, but the losses become `L: nan C: nan S: nan`, so I think what I did is wrong.

rw1995 avatar May 30 '19 11:05 rw1995
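For what it's worth, NaN losses after switching to an out-of-place ReLU are usually a separate problem (often learning rate or exploding gradients) rather than a consequence of the fix itself. A generic training-step sketch (not the repo's train.py; the model and loss here are placeholders) showing two common mitigations, gradient clipping and skipping non-finite losses:

```python
import torch
import torch.nn as nn

# Placeholder model/loss standing in for the FSSD network and multibox loss.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

inputs, targets = torch.randn(16, 10), torch.randn(16, 2)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
if torch.isfinite(loss):  # skip the update entirely on a nan/inf loss
    loss.backward()
    # cap the gradient norm so one bad batch cannot blow up the weights
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
    optimizer.step()
```

Lowering the learning rate (or warming it up over the first iterations) is typically the first thing to try when the loss diverges this early.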

I got the same issue. After changing to inplace=False, it returned `nan C: nan S: nan` while training.

naviocean avatar Oct 31 '19 04:10 naviocean

> In fssd_vgg.py I replaced `x = F.relu(v(x), inplace=True)` with `x = F.relu(v(x))` in the `self.extras` loop, and it trains, but the losses become `L: nan C: nan S: nan`, so I think what I did is wrong.

Hi, my friend, have you solved this problem? Those UserWarnings can be corrected easily; if you are still having trouble with them, you can contact me for the solutions.

zhaohao0404 avatar Jan 04 '20 02:01 zhaohao0404
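On the UserWarnings in the log above, the replacements are mechanical. A sketch assuming PyTorch >= 0.4:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as init

# 1) kaiming_normal -> kaiming_normal_ (trailing underscore = in-place init)
w = torch.empty(64, 3, 3, 3)
init.kaiming_normal_(w, mode='fan_out')

# 2) pass align_corners explicitly when upsampling with mode='bilinear'
#    (align_corners=True reproduces the pre-0.4.0 behavior)
x = torch.randn(1, 3, 8, 8)
up = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=True)

# 3) size_average/reduce -> a single reduction argument
loss_fn = nn.MSELoss(reduction='sum')
```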

> In fssd_vgg.py I replaced `x = F.relu(v(x), inplace=True)` with `x = F.relu(v(x))` in the `self.extras` loop, and it trains, but the losses become `L: nan C: nan S: nan`, so I think what I did is wrong.

> Hi, my friend, have you solved this problem? Those UserWarnings can be corrected easily; if you are still having trouble with them, you can contact me for the solutions.

Hello, I also encountered a similar problem. How did you solve it? Thank you @chaomartin @naviocean @rw1995

Shawn0Hsu avatar Dec 01 '20 14:12 Shawn0Hsu

> I got the same issue. After changing to inplace=False, it returned `nan C: nan S: nan` while training.

I have got this problem too. Have you solved it yet?

YXB-NKU avatar Sep 29 '21 09:09 YXB-NKU