deep-rl
Why is DDPG so unstable?
I can train a good agent, but the learning curve is quite noisy. Why? Is it an implementation issue or something intrinsic to DDPG?
@DontGiveUpEasily I trained it for 10k episodes with the code, but the result is
Looks like it did not converge. Does the code work?
@DontGiveUpEasily See my comment here: https://github.com/pemami4911/deep-rl/issues/2#issuecomment-400929047
Ideally, the OU (Ornstein-Uhlenbeck) exploration noise should be decayed over training, so that no noise is added to the actions once the policy has converged.
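One way to decay the OU noise is roughly the following. This is only a sketch, not the code from this repo: the class name `DecayingOUNoise`, the `end_episode` hook, and all default values (`theta`, `sigma`, `dt`, `decay`, `min_scale`) are assumptions chosen for illustration, and the multiplicative per-episode schedule is just one possible choice.

```python
import math
import random


class DecayingOUNoise:
    """Ornstein-Uhlenbeck noise whose output is scaled by a decaying factor.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2,
                 scale=1.0, decay=0.995, min_scale=0.0, seed=None):
        self.size = size
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.scale, self.decay, self.min_scale = scale, decay, min_scale
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean."""
        self.state = [self.mu] * self.size

    def sample(self):
        """Advance the OU process one step and return the scaled noise."""
        self.state = [
            x + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return [self.scale * x for x in self.state]

    def end_episode(self):
        """Shrink the noise scale once per episode, then reset the process."""
        self.scale = max(self.min_scale, self.scale * self.decay)
        self.reset()
```

In the training loop you would add `noise.sample()` to the actor's output at every step and call `noise.end_episode()` when an episode finishes. With `min_scale=0.0` the added noise shrinks toward zero as training goes on; for evaluation runs you can skip the noise entirely.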