[Question] Result better than baseline

Open Seraphli opened this issue 6 years ago • 9 comments

I ran some tests on Dopamine. When I ran the game 'Qbert', the result seemed better than the one presented in the baseline. I used the config located at dopamine/agents/rainbow/configs/rainbow.gin and only changed the environment name to Qbert.

image

Seraphli avatar Feb 18 '19 06:02 Seraphli

That does seem a little better. Our baseline results report 1) training scores, 2) with sticky actions. Are you using 2)?

mgbellemare avatar Feb 18 '19 15:02 mgbellemare
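
For readers unfamiliar with the term: with sticky actions, the environment sometimes repeats the previously executed action instead of the one the agent just chose, which makes the game stochastic and usually lowers scores. The sketch below only illustrates the idea; it is not Dopamine's or the emulator's implementation. The helper name `apply_sticky_action` and the default repeat probability of 0.25 are assumptions made for illustration.

```python
import random


def apply_sticky_action(previous_action, chosen_action, rng, repeat_prob=0.25):
    """Return the action that actually gets executed.

    Hypothetical helper: with probability `repeat_prob` the previously
    executed action is repeated, otherwise the newly chosen one is used.
    """
    if previous_action is not None and rng.random() < repeat_prob:
        return previous_action
    return chosen_action


# Simulate an agent that cycles through six actions.
rng = random.Random(0)
chosen_seq = [0, 1, 2, 3, 4, 5] * 1000
executed_seq = []
previous = None
for chosen in chosen_seq:
    previous = apply_sticky_action(previous, chosen, rng)
    executed_seq.append(previous)

# Fraction of steps where the executed action was not the chosen one.
overridden = sum(e != c for e, c in zip(executed_seq, chosen_seq)) / len(chosen_seq)
```

With a repeat probability of 0.25, roughly a quarter of the agent's choices end up overridden, which is why results with and without sticky actions are not directly comparable.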

I used the config located at dopamine/agents/rainbow/configs/rainbow.gin, so the configuration appears to use sticky actions.

Seraphli avatar Feb 18 '19 16:02 Seraphli

Strange. Maybe the code is getting better as it ages? :)

Is the x-axis on your TensorBoard in millions of agent steps, or millions of frames (4x the steps)? The numbers would match the former.

mgbellemare avatar Feb 18 '19 21:02 mgbellemare
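
The two units differ by a constant factor because of frame skipping. A minimal sketch of the conversion, assuming each agent step spans 4 emulator frames as the "x4" above suggests; the constant and function names are illustrative and not part of Dopamine.

```python
# Assumption: one agent step corresponds to 4 emulator frames (frame skip of 4).
FRAMES_PER_AGENT_STEP = 4


def agent_steps_to_frames(agent_steps, frames_per_step=FRAMES_PER_AGENT_STEP):
    """Convert a count of agent steps into emulator frames."""
    return agent_steps * frames_per_step


def frames_to_agent_steps(frames, frames_per_step=FRAMES_PER_AGENT_STEP):
    """Convert a count of emulator frames into whole agent steps."""
    return frames // frames_per_step


steps = 1_000_000
frames = agent_steps_to_frames(steps)
```

A curve plotted against agent steps will look four times "faster" than the same curve plotted against frames, so the axis unit has to match before comparing against a baseline.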

I didn't change any training code or TensorBoard summary code, so I think the axis matches the baseline. BTW, I ran the training for Pong before and the score didn't reach 15 as the baseline showed. I couldn't find the reason, so I ran Rainbow on Qbert instead.

image

Seraphli avatar Feb 19 '19 01:02 Seraphli

@Seraphli, were you using a GPU to run the code, and how long did it take to get the above results?

satyakesav avatar Feb 20 '19 04:02 satyakesav

@satyakesav 1d13h for first 100 iterations on Qbert

Seraphli avatar Feb 20 '19 04:02 Seraphli

@Seraphli, Ahh. I see. What GPU did you train it on?

satyakesav avatar Feb 20 '19 05:02 satyakesav

@satyakesav 1080ti

Seraphli avatar Feb 20 '19 05:02 Seraphli

@satyakesav 1d13h for first 100 iterations on Qbert

That is really fast. Normally for DQN, one iteration takes 40 minutes on a V100.

GoingMyWay avatar Aug 02 '20 07:08 GoingMyWay
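
For a rough sense of scale, the 1d13h quoted for 100 iterations can be converted to a per-iteration time and set against the 40 minutes mentioned for DQN. This is only arithmetic on the numbers in the thread; the agents and GPUs differ, so it is not a like-for-like benchmark.

```python
from datetime import timedelta

# Figures quoted in the thread.
total_time = timedelta(days=1, hours=13)   # Rainbow on Qbert, 1080 Ti
iterations = 100
dqn_minutes_per_iteration = 40             # DQN on a V100, as reported

minutes_per_iteration = total_time.total_seconds() / 60 / iterations
ratio = dqn_minutes_per_iteration / minutes_per_iteration

print(f"{minutes_per_iteration:.1f} min per iteration")  # prints "22.2 min per iteration"
```

So the reported run averages about 22 minutes per iteration, a little under twice as fast as the 40-minute figure.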