async-rl

Not sample efficient enough

Open muupan opened this issue 9 years ago • 4 comments

According to Figure 6 in the paper, their A3C needs only 20 epochs (20 million steps) to reach an average score of around 400 on Breakout. My current implementation needs more.

[Screenshot: 2016-05-08 18 10 18]

muupan avatar May 08 '16 09:05 muupan

Following the authors' feedback, now it's only slightly worse than theirs.

muupan avatar May 10 '16 09:05 muupan

@muupan Thank you for sharing your implementation and settings, with such great results!

Your wiki helps a lot, and I'm going to try your settings.

Let me ask you about a couple of things that are not covered in the wiki.

  1. There is loss normalization code for the case where a sequence terminates in the middle

https://github.com/muupan/async-rl/blob/master/a3c.py#L113-L118

Are you using this now?

  2. There is action-skipping code in ALE#initialize()

https://github.com/muupan/async-rl/blob/master/ale.py#L146-L149

What is this for?

I'm also going to adjust my parameters as described in your wiki. Thanks!!

miyosuda avatar May 10 '16 13:05 miyosuda
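
For readers following the first question, here is a minimal sketch of what loss normalization for a sequence cut short by a terminal state could look like. It is an assumption about the general idea, not a copy of the linked lines in a3c.py: the function name `rescale_truncated_loss` and its parameters are hypothetical, and it assumes the loss is a sum over the steps actually taken, so a short rollout is scaled up to be comparable to a full `t_max`-step rollout.

```python
def rescale_truncated_loss(loss, steps_taken, t_max):
    """Rescale the summed loss of a rollout that ended before t_max steps.

    Assumption (hypothetical helper, not the repository's code): `loss` is a
    sum over `steps_taken` steps, so multiplying by t_max / steps_taken keeps
    its scale comparable to that of a full-length rollout.
    """
    if steps_taken <= 0:
        raise ValueError("steps_taken must be positive")
    if steps_taken >= t_max:
        # Full-length rollout: nothing to normalize.
        return loss
    return loss * (t_max / steps_taken)


# A rollout that terminated after 1 of 5 steps has its loss scaled by 5.
short_rollout_loss = rescale_truncated_loss(2.0, steps_taken=1, t_max=5)
full_rollout_loss = rescale_truncated_loss(2.0, steps_taken=5, t_max=5)
```

Whether such rescaling helps is an empirical question; the reply below notes that it is not in use.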

  1. No, I don't use it now.
  2. It is called "no-op max" in the Nature DQN paper. It adds some randomness to initial states.

muupan avatar May 10 '16 13:05 muupan
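
The "no-op max" idea mentioned above can be sketched roughly as follows: after each reset, take a random number of do-nothing actions so that episodes do not always begin from the same state. This is an illustration under stated assumptions, not the code in ale.py: `CountingEnv`, `reset_with_noops`, and the `reset()`/`step()` interface are hypothetical stand-ins, and the default of 30 follows the "no-op max" value listed in the Nature DQN paper.

```python
import random


class CountingEnv:
    """Hypothetical stand-in for an emulator: the state is a step counter."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps

    def step(self, action):
        self.steps += 1
        done = False  # this toy environment never terminates
        return self.steps, 0.0, done


def reset_with_noops(env, rng, noop_max=30, noop_action=0):
    """Reset env, then apply between 0 and noop_max no-op actions."""
    obs = env.reset()
    n_noops = rng.randint(0, noop_max)  # inclusive on both ends
    for _ in range(n_noops):
        obs, _reward, done = env.step(noop_action)
        if done:
            # If the episode ends during the no-ops, start over.
            obs = env.reset()
    return obs, n_noops


rng = random.Random(0)
first_obs, n_noops = reset_with_noops(CountingEnv(), rng)
```

In this toy environment the returned observation equals the number of no-ops taken, which makes the randomized starting state easy to see.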

I see. Thank you!

miyosuda avatar May 10 '16 14:05 miyosuda