Deep-rl-mxnet
DDPG/TD3 action saturation
Hi,
I have found that the DDPG/TD3 algorithms easily lead to action saturation (actions stuck at their maximum value) when training on tasks with more than one action dimension. I noticed that you ran into this problem yourself and discussed it with others in 2019, so I would like to ask whether you have since resolved it. Thank you very much!