
Bug during the update of the action model


Hi, thank you for your code. When I try to run the following code in train.py:

        # update action model (multiply -1 for gradient ascent)
        action_loss = -1 * lambda_target_values.mean()
        action_optimizer.zero_grad()
        action_loss.backward()
        clip_grad_norm_(action_model.parameters(), args.clip_grad_norm)
        action_optimizer.step()

this error is raised:

        RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [400, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead.
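For reproduction purposes, here is a minimal standalone sketch that triggers the same RuntimeError (toy modules with hypothetical names, not the actual Dreamer code): the second backward pass needs a weight that the first optimizer step has already modified in place.

        import torch

        # toy stand-ins: both losses share one computation graph through `values`
        feature_net = torch.nn.Linear(4, 4)   # plays the role of the imagined rollout path
        value_net = torch.nn.Linear(4, 1)     # plays the role of the value model
        feature_opt = torch.optim.SGD(feature_net.parameters(), lr=0.1)
        value_opt = torch.optim.SGD(value_net.parameters(), lr=0.1)

        states = torch.randn(8, 4)
        values = value_net(feature_net(states))   # depends on both modules

        value_loss = values.pow(2).mean()
        value_loss.backward(retain_graph=True)
        value_opt.step()                  # in-place update of value_net.weight

        action_loss = -values.mean()      # backward needs value_net.weight, saved in the graph
        action_loss.backward()            # RuntimeError: ... modified by an inplace operation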

Can you have a look at this?

TimHo0331 avatar Mar 09 '22 16:03 TimHo0331

Changing the order of the calls (running both .backward() passes before either .step()) as below solved the error for me (torch 1.8.0+cu111). To my knowledge, the error is raised because action_loss can no longer be backpropagated once value_optimizer.step() has run: the step updates the value model's parameters in place, invalidating tensors saved in the shared computation graph.

        # update value model
        value_loss = 0.5 * mse_loss(imaginated_values, lambda_target_values.detach())
        value_optimizer.zero_grad()

        # update action model (multiply -1 for gradient ascent)
        action_loss = -1 * lambda_target_values.mean()
        action_optimizer.zero_grad()

        # run both backward passes before either optimizer step;
        # retain_graph=True keeps the shared graph alive for the second pass
        value_loss.backward(retain_graph=True)
        action_loss.backward()

        # clip gradients after they have been computed, then step
        clip_grad_norm_(value_model.parameters(), args.clip_grad_norm)
        clip_grad_norm_(action_model.parameters(), args.clip_grad_norm)

        value_optimizer.step()
        action_optimizer.step()
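Two notes on the ordering above. First, gradient clipping only has an effect after the gradients exist, so the clip_grad_norm_ calls are placed after the backward passes (immediately after zero_grad they would clip freshly zeroed gradients). Second, the essential fix is deferring both optimizer steps: Adam/SGD update parameters in place, and action_loss.backward() still needs the value model's original weights, which were saved in the graph that produced lambda_target_values.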

hynkis avatar Sep 18 '22 12:09 hynkis