dopamine [Question] Result better than baseline

Open Seraphli opened this issue 6 years ago • 9 comments

I ran some tests on dopamine. When I ran the game 'Qbert', the result seemed better than the one presented in the baseline. I used the config located at dopamine/agents/rainbow/configs/rainbow.gin and only changed the environment name to Qbert.

Feb 18 '19 06:02 Seraphli

That does seem a little better. Our baseline results are reporting 1) training scores, 2) with sticky actions. Are you using 2)?
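For readers unfamiliar with the term, here is a minimal sketch of what "sticky actions" means: with some probability the environment ignores the agent's new action and repeats the previous one. The function name `apply_sticky_action` is hypothetical, the repeat probability of 0.25 is the commonly used value and an assumption here, and the decision is simplified to one per agent step. This is not Dopamine's or the emulator's actual code.

```python
import random


def apply_sticky_action(requested, previous, repeat_prob=0.25, rng=random):
    """Return the action that is actually executed (hypothetical helper).

    With probability `repeat_prob` the previous action is repeated instead
    of the requested one. `repeat_prob=0.25` is an assumed default.
    """
    if previous is not None and rng.random() < repeat_prob:
        return previous
    return requested


def count_overridden(actions, repeat_prob=0.25, seed=0):
    """Count how many requested actions were replaced by the previous one."""
    rng = random.Random(seed)
    previous = None
    overridden = 0
    for requested in actions:
        executed = apply_sticky_action(requested, previous, repeat_prob, rng)
        if executed != requested:
            overridden += 1
        previous = executed
    return overridden
```

Because the agent cannot rely on its actions always being executed, scores with sticky actions are generally not directly comparable to scores without them, which is why the setting matters when comparing against a baseline.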

Feb 18 '19 15:02 mgbellemare

I used the config located at dopamine/agents/rainbow/configs/rainbow.gin, so the configuration should have sticky actions enabled.

Feb 18 '19 16:02 Seraphli

Strange. Maybe the code is getting better as it ages? :)

Is the x-axis on your TensorBoard in millions of agent steps, or millions of frames (4x the steps)? The numbers would match the former.
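To make the unit question concrete, here is a rough sketch of the conversion. The factor of 4 frames per agent step comes from the "x4" above. The 250,000 training steps per iteration is an assumption about the default config and should be checked against the gin file actually used. The function name is hypothetical.

```python
FRAME_SKIP = 4  # frames per agent step, as in "x4 steps" above
TRAINING_STEPS_PER_ITERATION = 250_000  # assumption: check your gin config


def iteration_to_millions(iteration,
                          steps_per_iteration=TRAINING_STEPS_PER_ITERATION,
                          frame_skip=FRAME_SKIP):
    """Return (millions of agent steps, millions of frames) after `iteration` iterations."""
    agent_steps = iteration * steps_per_iteration
    frames = agent_steps * frame_skip
    return agent_steps / 1e6, frames / 1e6


steps_m, frames_m = iteration_to_millions(100)
print(f"100 iterations: {steps_m} M agent steps, {frames_m} M frames")
```

Under these assumptions the same curve looks four times "faster" when plotted against agent steps than against frames, so a mismatch in x-axis units can make a run appear better or worse than a baseline.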

Feb 18 '19 21:02 mgbellemare

I didn't change any training code or TensorBoard summary code, so I think it matches the baseline. BTW, I ran the training for Pong before, and the score didn't reach 15 as the baseline showed. I didn't know the reason, so I ran Rainbow on Qbert.

Feb 19 '19 01:02 Seraphli

@Seraphli, were you using a GPU to run the code, and how long did it take to get the above results?

Feb 20 '19 04:02 satyakesav

@satyakesav 1d13h for first 100 iterations on Qbert

Feb 20 '19 04:02 Seraphli

@Seraphli, Ahh. I see. What GPU did you train it on?

Feb 20 '19 05:02 satyakesav

@satyakesav 1080ti

Feb 20 '19 05:02 Seraphli

> @satyakesav 1d13h for first 100 iterations on Qbert

That is really fast. For DQN, one iteration normally takes 40 minutes on a V100.
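For reference, the per-iteration time implied by the reported figure can be worked out as below. This is only arithmetic on the numbers quoted in this thread (1d13h for 100 iterations, 40 minutes per iteration). Note that the two figures are for different agents and GPUs, so the ratio is a rough comparison at best. The function name is hypothetical.

```python
from datetime import timedelta


def minutes_per_iteration(total, iterations):
    """Average wall-clock minutes per iteration."""
    return total.total_seconds() / 60 / iterations


reported = timedelta(days=1, hours=13)   # "1d13h" for Rainbow on a 1080 Ti
per_iter = minutes_per_iteration(reported, 100)
dqn_v100_minutes = 40                    # figure quoted above for DQN on a V100

print(f"Reported run: {per_iter:.1f} min/iteration")
print(f"Ratio to quoted DQN time: {dqn_v100_minutes / per_iter:.2f}x")
```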

Aug 02 '20 07:08 GoingMyWay
