Dreamer_PyTorch
Bug happened during the update of action model
Hi, thank you for your code. When I run the following code from train.py:

```python
# update action model (multiply -1 for gradient ascent)
action_loss = -1 * (lambda_target_values.mean())
action_optimizer.zero_grad()
action_loss.backward()
clip_grad_norm_(action_model.parameters(), args.clip_grad_norm)
action_optimizer.step()
```

I get this error:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [400, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead.
```
Can you have a look at this?
Changing the order of the `.backward()` and `.step()` calls as below solved the error (torch 1.8.0+cu111). To my knowledge, the error is raised because `action_loss` also backpropagates through the value model (the lambda targets contain its value predictions), so `action_loss` can no longer be backpropagated after `value_optimizer.step()` has modified those parameters in place.
```python
# update value model
value_loss = 0.5 * mse_loss(imaginated_values, lambda_target_values.detach())
# update action model (multiply -1 for gradient ascent)
action_loss = -1 * (lambda_target_values.mean())

# run both backward passes before either optimizer mutates its parameters in place
value_optimizer.zero_grad()
action_optimizer.zero_grad()
value_loss.backward(retain_graph=True)
action_loss.backward()

# clip gradients after backward() has computed them, just before stepping
clip_grad_norm_(value_model.parameters(), args.clip_grad_norm)
clip_grad_norm_(action_model.parameters(), args.clip_grad_norm)

value_optimizer.step()
action_optimizer.step()
```
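
If it helps, here is a minimal standalone sketch of the same failure mode. Everything in it (the toy `value_model`, `action_param`, sizes, learning rates) is a stand-in of my own rather than the repo's code; the only thing it shares with Dreamer is that both losses flow through the value network, the way `action_loss` flows through the value predictions inside `lambda_target_values`:

```python
import torch
import torch.nn as nn

def make_graph():
    """Tiny shared graph: both losses flow through value_model, mirroring
    how lambda_target_values depends on the value network in Dreamer."""
    torch.manual_seed(0)
    value_model = nn.Linear(4, 1)                      # toy value network
    action_param = torch.randn(4, requires_grad=True)  # toy actor output
    values = value_model(action_param)
    value_loss = 0.5 * (values - 1.0).pow(2).mean()    # regress toward a fixed target
    action_loss = -values.mean()                       # ascend the predicted value
    value_opt = torch.optim.SGD(value_model.parameters(), lr=0.1)
    action_opt = torch.optim.SGD([action_param], lr=0.1)
    return value_loss, action_loss, value_opt, action_opt

# failing order: value_opt.step() rewrites value_model.weight in place,
# but action_loss.backward() still needs that weight at its old version
value_loss, action_loss, value_opt, action_opt = make_graph()
value_loss.backward(retain_graph=True)
value_opt.step()
try:
    action_loss.backward()
except RuntimeError as err:
    print("reproduced:", err)

# fixed order: complete both backward passes before either step()
value_loss, action_loss, value_opt, action_opt = make_graph()
value_loss.backward(retain_graph=True)
action_loss.backward()
value_opt.step()
action_opt.step()
print("reordered version runs without error")
```

`optimizer.step()` performs the in-place parameter update (an `add_` under the hood), which bumps the version counter on tensors that autograd saved during the forward pass; the `[torch.FloatTensor [400, 1]]` in the traceback is presumably such a saved view of a value-network weight. That is why every backward pass through shared parameters has to finish before any `step()` runs.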