dopamine Value of Epsilon Decay Period

In the TF version of DQN, the value of epsilon_decay_period is set to 1M steps (see here), and for Rainbow, the value is set to 250k steps (see here).

However, the Rainbow paper says that for DQN, epsilon is annealed over the first 4M frames (i.e. 1M steps, as done in Dopamine above). Importantly, for the variant without Noisy Nets (which is the case with TF Rainbow), epsilon is annealed over the first 250K frames, not steps; with the standard frame skip of 4, that is 62,500 steps.
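To make the arithmetic concrete, here is a minimal sketch of the frame-to-step conversion and of a linear epsilon schedule under the two readings of the decay period. The helper names `frames_to_steps` and `linear_epsilon` are hypothetical and written only for this illustration. The schedule is a plain linear anneal from 1.0 to a final epsilon of 0.01 with no warmup (both assumptions), not a copy of Dopamine's own code.

```python
def frames_to_steps(frames: int, frame_skip: int = 4) -> int:
    """Convert emulator frames to agent steps (one step spans frame_skip frames)."""
    return frames // frame_skip


def linear_epsilon(step: int, decay_period: int, warmup_steps: int = 0,
                   final_epsilon: float = 0.01) -> float:
    """Linearly anneal epsilon from 1.0 down to final_epsilon over decay_period steps.

    Hypothetical helper for illustration; warmup_steps=0 and final_epsilon=0.01
    are assumptions, not values taken from a specific config file.
    """
    steps_left = decay_period + warmup_steps - step
    bonus = (1.0 - final_epsilon) * steps_left / decay_period
    # Clip so epsilon stays within [final_epsilon, 1.0].
    bonus = min(max(bonus, 0.0), 1.0 - final_epsilon)
    return final_epsilon + bonus


paper_period = frames_to_steps(250_000)  # 250K frames read as 62_500 agent steps
config_period = 250_000                  # 250K read directly as agent steps

for step in (0, 62_500, 125_000, 250_000):
    eps_paper = linear_epsilon(step, paper_period)
    eps_config = linear_epsilon(step, config_period)
    print(f"step={step:>7}  paper reading: {eps_paper:.4f}  config reading: {eps_config:.4f}")
```

Under these assumptions, the paper reading has already reached the final epsilon at step 62,500, while the 250k-step reading is still exploring heavily at that point.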

Is there a discrepancy here (Rainbow should anneal within 62.5k steps, not 250k steps), or am I misunderstanding something? Or perhaps it really doesn't matter? Thank you for your time.

Screenshot of page 4 of Rainbow paper

Nov 11 '22 10:11 rfali

Also, the JAX Full Rainbow agent uses Noisy Nets, and when Noisy Nets are used, epsilon-greedy is disabled (as in the paper snippet above, and in some other implementations such as Kaixhin's Rainbow here and here). However, I still see epsilon_train set to 0.01 in JAX Full Rainbow (here), and if Noisy is true, the identity_epsilon function is called, which just returns the epsilon value rather than 0.
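A small sketch of why this matters: an identity-style epsilon function passes the configured epsilon through, so a nonzero epsilon_train still produces occasional random actions even when Noisy Nets are meant to handle exploration. The functions below are standalone stand-ins written for this illustration. `zero_epsilon` and `count_random_actions` are hypothetical names, and the four-argument signature is an assumption about how such epsilon functions are called.

```python
import random


def identity_epsilon(decay_period, step, warmup_steps, epsilon):
    """Stand-in: ignore the schedule arguments and return epsilon unchanged."""
    return epsilon


def zero_epsilon(decay_period, step, warmup_steps, epsilon):
    """Hypothetical alternative: always return 0 so action selection is purely greedy."""
    return 0.0


def count_random_actions(epsilon_fn, epsilon_train, num_steps, seed=0):
    """Count how often an epsilon-greedy rule would pick a random action."""
    rng = random.Random(seed)
    count = 0
    for step in range(num_steps):
        eps = epsilon_fn(250_000, step, 20_000, epsilon_train)
        if rng.random() < eps:
            count += 1
    return count


num_steps = 100_000
print("identity_epsilon:", count_random_actions(identity_epsilon, 0.01, num_steps))
print("zero_epsilon:    ", count_random_actions(zero_epsilon, 0.01, num_steps))
```

With epsilon_train at 0.01, the identity version should pick a random action on roughly 1% of steps, while the zero version never does.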

Nov 11 '22 10:11 rfali

Thank you for pointing this out! This has been fixed here: https://github.com/google/dopamine/commit/ed92c57bd547db68d63aabee383d4c55756a6a0f

Nov 28 '22 16:11 psc-g

Thanks! As for

Is there a discrepancy here (Rainbow should anneal within 62k steps and not 250k steps), or am I misunderstanding something (or perhaps it really doesn't matter?)

Should the epsilon_decay_period value for TF Rainbow (which does not use Noisy Nets) be 250k frames as in the Rainbow paper (which makes it 62,500 steps with frame_skip=4), or 250k steps (as in the current implementation), or perhaps it does not matter? I have rarely seen a value as low as 62,500 steps for epsilon decay; for example, RLlib also uses 200k for its DQN variant, and epsilon-greedy exploration is off when using Noisy Nets.

Nov 29 '22 03:11 rfali
