Implementation-MolGAN-PyTorch

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation:

Open LiuJinzhe-Keepgoing opened this issue 1 year ago • 2 comments

Thank you for your exciting work! But when I was running the GAN model, I encountered a problem:

    Start training...
    [W python_anomaly_mode.cpp:104] Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
      File "main_gan.py", line 65, in <module>
        main(config)
      File "main_gan.py", line 58, in main
        solver.train_and_validate()
      File "/workspace/solver_gan.py", line 221, in train_and_validate
        self.train_or_valid(epoch_i=i, train_val_test='train')
      File "/workspace/solver_gan.py", line 338, in train_or_valid
        edges_logits, nodes_logits = self.G(z)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/workspace/models_gan.py", line 30, in forward
        nodes_logits = self.nodes_layer(output)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 94, in forward
        return F.linear(input, self.weight, self.bias)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 1753, in linear
        return torch._C._nn.linear(input, weight, bias)
     (function _print_stack)
    Traceback (most recent call last):
      File "main_gan.py", line 65, in <module>
        main(config)
      File "main_gan.py", line 58, in main
        solver.train_and_validate()
      File "/workspace/solver_gan.py", line 221, in train_and_validate
        self.train_or_valid(epoch_i=i, train_val_test='train')
      File "/workspace/solver_gan.py", line 385, in train_or_valid
        train_step_V.backward()
      File "/opt/conda/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/opt/conda/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 45]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

Can I ask how to solve it? Thank you very much.

LiuJinzhe-Keepgoing avatar Oct 09 '23 06:10 LiuJinzhe-Keepgoing

Hello! @LiuJinzhe-Keepgoing I found a solution which at least lets me train the network (I haven't seen final results yet). In solver_gan.py, around line 385 as your error suggests, replace:

            # Optimise generator.
            if cur_step % self.n_critic == 0:
                train_step_G.backward(retain_graph=True)
                self.g_optimizer.step()
 
            # Optimise value network.
            if cur_step % self.n_critic == 0:
                train_step_V.backward(retain_graph=True)
                self.v_optimizer.step()
                

With the following:

            # Run both backward passes before either optimizer step, so that
            # g_optimizer.step() does not modify weights in place that
            # train_step_V.backward() still needs.
            if cur_step % self.n_critic == 0:
                train_step_G.backward(retain_graph=True)
                train_step_V.backward(retain_graph=True)
                self.g_optimizer.step()
                self.v_optimizer.step()

(Sorry for the bad formatting; this is the first time I have responded to an issue.) Hope this helps!

Solution found in the following thread: https://discuss.pytorch.org/t/solved-pytorch1-5-runtimeerror-one-of-the-variables-needed-for-gradient-computation-has-been-modified-by-an-inplace-operation/90256/34
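For anyone else hitting this: the reordering helps because train_step_V.backward() still needs the generator weights that were saved during the forward pass, and calling self.g_optimizer.step() first updates those weights in place (that is what "is at version 2; expected version 1" refers to). Here is a minimal standalone sketch of the same failure mode; the module sizes and loss definitions are made up for illustration, this is not the repo's code:

    import torch
    import torch.nn as nn

    # Stand-ins for the generator and the value network.
    gen = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 45))
    val = nn.Linear(45, 1)
    g_opt = torch.optim.Adam(gen.parameters())
    v_opt = torch.optim.Adam(val.parameters())

    z = torch.randn(32, 8)
    fake = gen(z)                  # shared forward pass
    loss_g = -fake.mean()          # placeholder generator loss
    loss_v = val(fake).mean()      # placeholder value-network loss

    # Broken ordering (backward -> step -> backward on a shared graph):
    #   loss_g.backward(retain_graph=True)
    #   g_opt.step()               # updates gen's weights in place
    #   loss_v.backward()          # still needs the old weights -> RuntimeError
    #
    # Working ordering: finish every backward pass before any optimizer step.
    loss_g.backward(retain_graph=True)
    loss_v.backward()
    g_opt.step()
    v_opt.step()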

joelhil avatar Oct 12 '23 14:10 joelhil

Thank you, joelhil! You have resolved my confusion. Thank you very much for the work you and your team have done. Best wishes and respect to you.

LiuJinzhe-Keepgoing avatar Oct 14 '23 12:10 LiuJinzhe-Keepgoing