
The effect of NormalizedEnv

pengzhi1998 opened this issue 4 years ago • 2 comments

Hi, thank you for this great implementation! However, I'm not sure what normalized_env.py does. When I remove it, the results seem to be worse than when I keep it. What is its effect? Looking forward to your reply!

pengzhi1998 · Jan 21 '21 02:01

There are several parts to my answer:

  • Strangely, I had to change the method names from _action and _reverse_action to action and reverse_action for the code to work - maybe this has to do with the gym version (mine is 0.18.0).
  • If you try adding print statements in the action and reverse_action methods, you would see that action is being called repeatedly but reverse_action is never called.
  • Gym provides the ActionWrapper class and its action method to "modify the actions passed to the environment" - see https://alexandervandekleut.github.io/gym-wrappers/ for a reference. I'm actually not sure about the purpose of the reverse_action method, so I would welcome any other thoughts on it.
  • So basically the actor network outputs values in the range of tanh, i.e. [-1, 1], but you can imagine an environment whose range of possible actions is, for example, [2, 7]. The wrapper linearly rescales the network's output into that range before passing it to the environment, so the actor never has to learn the environment-specific action scale (see the sketch after this list).
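
For reference, here is a minimal sketch of what such an action-rescaling wrapper typically looks like. This is my own illustration of the idea, not necessarily identical to normalized_env.py in this repo, and it assumes a gym version where the hooks are named action and reverse_action (see my first bullet); Pendulum-v0 below is just an example environment with a Box action space.

```python
import gym

class NormalizedEnv(gym.ActionWrapper):
    """Rescale actions from the tanh range [-1, 1] into the env's action bounds."""

    def action(self, action):
        # Linear map [-1, 1] -> [low, high]; called on every action sent to the env.
        low, high = self.action_space.low, self.action_space.high
        scale = (high - low) / 2.0
        offset = (high + low) / 2.0
        return scale * action + offset

    def reverse_action(self, action):
        # Inverse map [low, high] -> [-1, 1]; as noted above, this never seems to be called.
        low, high = self.action_space.low, self.action_space.high
        scale = (high - low) / 2.0
        offset = (high + low) / 2.0
        return (action - offset) / scale

# Example usage (hypothetical): wrap any env with a Box action space.
env = NormalizedEnv(gym.make("Pendulum-v0"))
```

With an action range of [2, 7], scale is 2.5 and offset is 4.5, so a tanh output of 0.0 maps to 4.5 and an output of 1.0 maps to 7.0.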

zhihanyang2022 · Apr 15 '21 23:04

Thank you for your response!

pengzhi1998 · Apr 16 '21 03:04