RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Open renleidewenming opened this issue 5 years ago • 4 comments

When I train the edge model, the following RuntimeError appears and I don't know how to deal with it. Can anyone help me? My PyTorch version is 1.0.0a0+90737f7.

Traceback (most recent call last):
  File "train.py", line 2, in <module>
    main(mode=1)
  File "/home/share/edge-connect/main.py", line 56, in main
    model.train()
  File "/home/share/edge-connect/src/edge_connect.py", line 115, in train
    self.edge_model.backward(gen_loss, dis_loss)
  File "/home/share/edge-connect/src/models.py", line 145, in backward
    dis_loss.backward()
  File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 102, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

renleidewenming avatar Oct 25 '19 08:10 renleidewenming

Hi, I met the same problem. Have you solved it?

Guanyunlph avatar Oct 20 '20 03:10 Guanyunlph

Hi, do you have a solution?

g-h-anna avatar Mar 15 '21 14:03 g-h-anna

I ran into the same problem when trying to train the network with a newer PyTorch version (1.9.0). It looks like it is connected to this issue: https://github.com/pytorch/pytorch/issues/39141

Following what is described in that issue, I adjusted the backward functions in the edge and inpaint models so that both backward() passes run before either optimizer step, e.g.:

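def backward(self, gen_loss=None, dis_loss=None):
  # Run both backward passes while the weights are still unmodified.
  dis_loss.backward()
  gen_loss.backward()

  # Only then let the optimizers update the weights in place.
  self.dis_optimizer.step()
  self.gen_optimizer.step()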

Since the optimizer checks causing this error seem to not have been implemented in PyTorch versions < 1.5.0, it might be that the computed gradients are actually not correct when using earlier versions.
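
For anyone who wants to reproduce this outside the project, here is a minimal sketch of the two orderings. The two `nn.Linear` layers, the SGD optimizers, and the loss formulas are hypothetical stand-ins, not the edge-connect models; the only point is that the generator loss backpropagates through the discriminator weights, which `dis_opt.step()` modifies in place.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_models():
    # Hypothetical stand-ins for the real generator and discriminator.
    gen = nn.Linear(4, 4)
    dis = nn.Linear(4, 1)
    gen_opt = torch.optim.SGD(gen.parameters(), lr=0.1)
    dis_opt = torch.optim.SGD(dis.parameters(), lr=0.1)
    return gen, dis, gen_opt, dis_opt


def compute_losses(gen, dis):
    real = torch.randn(8, 4)
    fake = gen(torch.randn(8, 4))
    # The discriminator loss sees the fake batch detached from the generator.
    dis_loss = dis(fake.detach()).mean() - dis(real).mean()
    # The generator loss backpropagates through the discriminator weights.
    gen_loss = -dis(fake).mean()
    return gen_loss, dis_loss


def step_then_backward():
    # Ordering that triggers the error: an optimizer step between the two
    # backward passes.
    gen, dis, gen_opt, dis_opt = make_models()
    gen_loss, dis_loss = compute_losses(gen, dis)
    dis_loss.backward()
    dis_opt.step()  # updates the discriminator weights in place
    try:
        gen_loss.backward()  # the graph still needs the old weights
    except RuntimeError as err:
        return str(err)
    gen_opt.step()
    return None


def backward_then_step():
    # Reordered version: both backward passes first, then both steps.
    gen, dis, gen_opt, dis_opt = make_models()
    gen_loss, dis_loss = compute_losses(gen, dis)
    dis_loss.backward()
    gen_loss.backward()
    dis_opt.step()
    gen_opt.step()
    return None


old_result = step_then_backward()
new_result = backward_then_step()
print("step before backward:", old_result)
print("backward before step:", new_result)
```

On a recent PyTorch the first ordering should report the in-place modification error and the second should finish cleanly. One caveat with the reordering: `gen_loss.backward()` also accumulates gradients into the discriminator parameters, so `dis_opt.step()` then applies gradients from both losses unless they are cleared or the discriminator parameters are excluded during the generator pass.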

cgsaxner avatar Jul 21 '21 11:07 cgsaxner

The problem occurs with PyTorch versions 1.5.0 and later. This project uses PyTorch 1.0, so try sticking to that version, together with torchvision 0.3.0, to run it.
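
To check which side of that boundary an installation falls on, a small helper like the one below could be used. It is only a sketch: the function names are made up for illustration, and the 1.5 threshold is taken from the comments in this thread rather than verified independently.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.0.0a0+90737f7'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def has_inplace_check(version, threshold=(1, 5)):
    # Assumption from this thread: the check was added in PyTorch 1.5.
    return parse_major_minor(version) >= threshold


print(has_inplace_check("1.0.0a0+90737f7"))  # False
print(has_inplace_check("1.9.0"))  # True
```

In practice you would pass `torch.__version__` as the argument.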

dhruvagarwal avatar Jul 27 '21 11:07 dhruvagarwal