pytorch-ddpg
The effect of NormalizedEnv
Hi, thank you for this great implementation!! However, I'm not very sure about the effect of normalized_env.py. Actually, if I remove it, the results seem to be worse than when I keep it. What does it do? Looking forward to your reply!
There are several levels to my answer:

- Strangely, I had to change the method names from `_action` and `_reverse_action` to `action` and `reverse_action` for the code to work. Maybe this has to do with the gym version (mine is 0.18.0).
- If you add print statements to the `action` and `reverse_action` methods, you will see that `action` is called repeatedly but `reverse_action` is never called.
- The reason Gym has the `ActionWrapper` class and the `action` method is that it is "Used to modify the actions passed to the environment." See the reference at https://alexandervandekleut.github.io/gym-wrappers/. I'm actually not sure about the `reverse_action` method, so I would welcome any other thoughts on this.
- So basically the actor network outputs something in the range of Tanh, i.e. [-1, 1], but you can imagine an environment in which the range of possible actions is, for example, [2, 7]. The wrapper's job is to linearly rescale the network's output into the environment's actual action range.
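To make the rescaling concrete, here is a minimal standalone sketch of the linear mapping a wrapper like `NormalizedEnv` performs. The function names and the explicit `low`/`high` arguments are my own for illustration (the real wrapper would read these bounds from `env.action_space`), but the arithmetic is the standard affine map between [-1, 1] and [low, high]:

```python
def normalize_action(action: float, low: float, high: float) -> float:
    """Map an actor output in the Tanh range [-1, 1] to [low, high].

    Hypothetical standalone version of what an ActionWrapper's
    `action` method does before passing the action to env.step().
    """
    act_k = (high - low) / 2.0  # half-width of the target range
    act_b = (high + low) / 2.0  # midpoint of the target range
    return act_k * action + act_b


def reverse_action(action: float, low: float, high: float) -> float:
    """Inverse mapping: take an env-scale action back to [-1, 1]."""
    act_k_inv = 2.0 / (high - low)
    act_b = (high + low) / 2.0
    return act_k_inv * (action - act_b)
```

For the [2, 7] example above, a Tanh output of -1 maps to 2, +1 maps to 7, and 0 maps to the midpoint 4.5, which is why training degrades without the wrapper: the raw actor outputs never reach most of the environment's action range.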
Thank you for your response!