pytorch-ddpg
The effect of NormalizedEnv
Hi, thank you for this great implementation!! However, I'm not very sure about the effect of normalized_env.py. Actually, if I remove it, the results seem to be worse than when I keep it. What does it do? Looking forward to your reply!
There are several levels to my answer:

- Strangely, I had to change the method names from `_action` and `_reverse_action` to `action` and `reverse_action` for the code to work. Maybe this has to do with the gym version (mine is 0.18.0).
- If you add print statements to the `action` and `reverse_action` methods, you will see that `action` is called repeatedly but `reverse_action` is never called.
- The reason Gym has the `ActionWrapper` class and the `action` method is that it is "Used to modify the actions passed to the environment." See the reference at https://alexandervandekleut.github.io/gym-wrappers/. I'm actually not sure about the `reverse_action` method, so I would welcome any other thoughts on this.
- So basically the actor network outputs something in the range of Tanh, i.e. [-1, 1], but you can imagine an environment in which the range of possible actions is, for example, [2, 7]. The wrapper's job is to linearly rescale the network's output into the environment's actual action range.
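To make the rescaling concrete, here is a minimal standalone sketch of the linear mapping a wrapper like `NormalizedEnv` performs. The function names and the explicit `low`/`high` arguments are my own for illustration (the real wrapper would read these bounds from `env.action_space`), but the arithmetic is the standard affine map between [-1, 1] and [low, high]:

```python
def normalize_action(action: float, low: float, high: float) -> float:
    """Map an actor output in the Tanh range [-1, 1] to [low, high].

    Hypothetical standalone version of what an ActionWrapper's
    `action` method does before passing the action to env.step().
    """
    act_k = (high - low) / 2.0  # half-width of the target range
    act_b = (high + low) / 2.0  # midpoint of the target range
    return act_k * action + act_b


def reverse_action(action: float, low: float, high: float) -> float:
    """Inverse mapping: take an env-scale action back to [-1, 1]."""
    act_k_inv = 2.0 / (high - low)
    act_b = (high + low) / 2.0
    return act_k_inv * (action - act_b)
```

For the [2, 7] example above, a Tanh output of -1 maps to 2, +1 maps to 7, and 0 maps to the midpoint 4.5, which is why training degrades without the wrapper: the raw actor outputs never reach most of the environment's action range.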
Thank you for your response!