Breakout benchmarks

Open RexGLiu opened this issue 4 years ago • 2 comments

Hi, I'm trying to run the R2D1 asynchronous alternating code that ships with rlpyt on Breakout. Has anyone managed to replicate DeepMind's R2D2 benchmark on Breakout with it? After several days of training, performance plateaus at around 400, whereas the benchmark is at around 800. I get the same outcome whether I use 30 or 100 CPUs on a cluster. I know the white paper mentions that R2D1 doesn't reproduce the benchmark on some Atari games, but it makes no mention of Breakout.

RexGLiu, Jul 04 '20 18:07

Right, I don't recall ever running Breakout, but I have previously had DQN implementations come back kind of low on that game. I'm not sure why.

One possibility is the question of whether to press FIRE before the 30 no-op start, which changed in a recent PR: #158. That said, I've done at least one test with this change, and it didn't really affect any of the few games I tried, including Breakout.
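For readers unfamiliar with the reset question raised above, here is a minimal sketch of what "press FIRE before the no-op start" could look like. It is not rlpyt's code and does not reproduce PR #158: `StubEnv` and `reset_with_noops` are hypothetical names, the Gym-style `step` return signature is assumed, and the action indices `NOOP = 0` and `FIRE = 1` are assumed to follow the usual ALE Breakout action set.

```python
import random

# Assumed action indices (the usual ALE Breakout layout); verify for your env.
NOOP, FIRE = 0, 1


class StubEnv:
    """Hypothetical stand-in for an Atari env; it only records actions taken."""

    def __init__(self):
        self.actions = []

    def reset(self):
        self.actions = []
        return 0  # dummy observation

    def step(self, action):
        self.actions.append(action)
        return 0, 0.0, False, {}  # obs, reward, done, info


def reset_with_noops(env, max_noops=30, fire_first=True, rng=random):
    """Reset, optionally press FIRE once, then take 1..max_noops no-op steps."""
    obs = env.reset()
    if fire_first:
        obs, _, done, _ = env.step(FIRE)
        if done:
            obs = env.reset()
    for _ in range(rng.randint(1, max_noops)):  # randint is inclusive at both ends
        obs, _, done, _ = env.step(NOOP)
        if done:
            obs = env.reset()
    return obs


env = StubEnv()
reset_with_noops(env, fire_first=True, rng=random.Random(0))
print(env.actions[0] == FIRE, len(env.actions) - 1, "no-ops followed")
```

Toggling `fire_first` lets the two reset behaviours be compared under an otherwise identical setup, which is the kind of test described in the comment above.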

Any news/findings would be appreciated!

astooke, Jul 07 '20 20:07

Hi @astooke, thanks for the reply! I took your launch_atari_r2d1_async_alt_gravitar.py script and changed the game from gravitar to breakout. I also changed the number of workers to 32 and ran the script for a week. After 1 or 2 days, it reaches an average return of around 350-400 and stays there, whereas DeepMind's R2D2 benchmark is at around 800. I then ran it on a different machine where I could get 160 workers instead. The learning curves look pretty much the same, so increasing the number of workers doesn't seem to do much for this game. Those are the findings I have at the moment. I'll let you know if I find out anything more.

RexGLiu, Jul 07 '20 20:07