spinning-up-a-Pong-AI-with-deep-RL

Code for "Spinning Up a Pong AI With Deep RL" on FloydHub.

Results: 2 spinning-up-a-Pong-AI-with-deep-RL issues

I trained your network (without logging) for 12,000 iterations and I'm not seeing good results. The AI is definitely improving, but it's nowhere close to beating the CPU consistently....

Hi, I am trying to understand policy gradients and figure out why you didn't make use of the log action probability and the advantage function when calculating y_train.
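
For context on the terms in this question, here is a minimal sketch of a REINFORCE-style loss that combines the log action probability with an advantage estimate. It is not the repository's code: the function names, the `gamma` default, and the choice of a normalized discounted return as the advantage are all assumptions made for illustration, and how `y_train` is actually built in the project is not shown here.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for each step of one episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def policy_gradient_loss(action_probs, actions, rewards, gamma=0.99):
    """Hypothetical helper: -mean(log pi(a_t | s_t) * A_t) over one episode."""
    action_probs = np.asarray(action_probs, dtype=np.float64)
    actions = np.asarray(actions)
    returns = discounted_returns(rewards, gamma)
    # Assumption: the advantage is the return minus a mean baseline,
    # scaled to unit variance. Other advantage estimators exist.
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Probability the policy assigned to the action actually taken.
    chosen = action_probs[np.arange(len(actions)), actions]
    log_probs = np.log(chosen + 1e-8)
    return -np.mean(log_probs * advantages)
```

Minimizing this loss raises the probability of actions that were followed by above-average returns and lowers it for the rest. In a real training loop the same expression would be written in an autodiff framework so that gradients flow through `action_probs`.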