edge-connect
When I tried to start training, I got an error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 512, 4, 4]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Thank you very much for your contribution. Your article helps me a lot.
As mentioned in the title, I encountered an error at the beginning of my training. The detailed error information is as follows:
Traceback (most recent call last):
File "E:/our code/edge-connect-master/train.py", line 2, in
Have you ever encountered this error in training? I would appreciate it if you could tell me how to solve this problem!
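(For reference, the hint at the end of the error message can be followed directly. A minimal sketch; exactly where the call goes in train.py is an assumption, but it should run before any training step:)

import torch

# Enable anomaly detection before training starts; the next backward pass
# will then report which forward operation produced the tensor that was
# later modified in place.
torch.autograd.set_detect_anomaly(True)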
Hello, the following solution solved the problem: https://stackoverflow.com/questions/71793678/i-am-running-into-a-gradient-computation-inplace-error
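(The gist of that answer: in newer PyTorch versions, optimizer.step() updates parameters in place, so calling it between two backward passes invalidates tensors the second pass still needs. The fix is to run all backward() calls first and all step() calls last. Below is a minimal self-contained sketch of the pattern; the tiny models and losses are hypothetical stand-ins, not EdgeConnect's:)

import torch
from torch import nn, optim

gen = nn.Linear(4, 4)   # stand-in generator
dis = nn.Linear(4, 1)   # stand-in discriminator
gen_optimizer = optim.Adam(gen.parameters())
dis_optimizer = optim.Adam(dis.parameters())

x = torch.randn(8, 4)
fake = gen(x)
gen_loss = -dis(fake).mean()          # generator loss flows through dis
dis_loss = dis(fake.detach()).mean()  # discriminator sees detached output

# All backward passes first, then all in-place parameter updates.
# Stepping dis_optimizer before gen_loss.backward() would bump the
# version counter of dis's weights and trigger the reported error.
gen_loss.backward()
dis_loss.backward()
gen_optimizer.step()
dis_optimizer.step()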
Hello, thank you for your help. I tried to modify the code according to the link you provided, but the same error still occurs. I've seen people say it's a PyTorch version issue, but I want to fix it without downgrading PyTorch. Can you help me?
Did you move the steps to be after the backward?
Yes, I changed the code in the backpropagation as follows, but it didn't work.
Hello, and I apologize for the delay. This was the issue for me; I believe you should also check if dis_loss and gen_loss are not None before using them.
def backward(self, gen_loss=None, dis_loss=None):
    # Guard against None so a phase that trains only one network
    # can still call backward().
    if gen_loss is not None:
        gen_loss.backward()
        self.gen_optimizer.step()
    if dis_loss is not None:
        dis_loss.backward()
        self.dis_optimizer.step()
You can try it this way. Although I ran it this way, the results were not ideal, and I don't know why. I look forward to our follow-up communication.
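(For context, a sketch of how this backward() is driven from the training loop; the shape of the process() call is based on how EdgeConnect's models return their losses, but treat the exact signature as an assumption and check your own training loop:)

# inside the training loop (sketch):
outputs, gen_loss, dis_loss, logs = model.process(images, edges, masks)
model.backward(gen_loss, dis_loss)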
Thanks for your help, I have solved the problem.
In which model mode did the retraining not work well?
Could you please tell me how you solved this problem? Thank you very much!
Hello, the backward of both stages' networks needs to be modified. I had only changed the backward of the stage-two network earlier. This might help you!
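(In other words, models.py defines a separate backward() for each stage's model, so the reordering has to be applied in each class. A sketch of where the change goes; the class names EdgeModel and InpaintingModel follow the EdgeConnect repo as I remember it, so verify them against your copy:)

class EdgeModel(BaseModel):
    def backward(self, gen_loss=None, dis_loss=None):
        ...  # reorder backward()/step() here (stage one)

class InpaintingModel(BaseModel):
    def backward(self, gen_loss=None, dis_loss=None):
        ...  # apply the identical reordering here (stage two)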
Sorry, I'm only seeing your reply now. My problem occurs with model = 1. Although 540,000 iterations of training have been run, the results are still quite blurry.
Thanks for your reply. I only use the first stage of EdgeConnect, so I don't think it's a big problem if the second stage is not modified.
Hello! I changed models.py as follows:
def backward(self, gen_loss=None, dis_loss=None):
    if dis_loss is not None:
        dis_loss.backward(retain_graph=True)
    if gen_loss is not None:
        gen_loss.backward()
    # both optimizer steps run only after all backward passes
    self.dis_optimizer.step()
    self.gen_optimizer.step()
(For comparison, the version of this backward() before the None checks was:)

def backward(self, gen_loss=None, dis_loss=None):
    dis_loss.backward(retain_graph=True)
    gen_loss.backward()
    self.dis_optimizer.step()
    self.gen_optimizer.step()
And I ran stage 3 of EdgeConnect for 20 epochs on PyTorch 1.7. The outputs look fine to me, so I suggest you try it.
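(For what it's worth, the retain_graph=True on the first backward() matters whenever a second backward pass has to traverse shared parts of the autograd graph: by default the first pass frees the graph's buffers, and the second pass would raise. A minimal standalone demo of that behavior:)

import torch

x = torch.ones(3, requires_grad=True)
loss = (x * 2).sum()

loss.backward(retain_graph=True)  # keep buffers for another backward pass
loss.backward()                   # fine; without retain_graph=True this raises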
Hello, I made the same changes, but the results were very poor, and the metrics differ significantly from those reported in the original paper.
It's hard to say why this problem happens for you. I used stage 1 and stage 3 again, and the network could generate decent results. backward() should be called before optimizer.step(), which is why I changed backward(self, gen_loss=None, dis_loss=None) as shown above. You may need to rethink why it does not work, or read the solution at https://stackoverflow.com/questions/71793678/i-am-running-into-a-gradient-computation-inplace-error It may help you. Bye.