Some evaluation results are missing

Open muupan opened this issue 9 years ago • 2 comments

In scores.txt of the currently uploaded trained model, the evaluation results at 55000000 and 56000000 are missing.

https://github.com/muupan/async-rl/blob/0ec501c36b6cdbe3888d3a6fc0043e4bc6c2cba3/trained_model/breakout/scores.txt#L55

I don't know why, or whether it affects performance. I need to check.

May 11 '16 02:05 muupan

I found that the missing evaluations are caused by processes getting stuck in evaluate_performance(). It is possible that some policies fail to start playing Breakout, which prevents episodes from ever terminating. If so, it might be necessary to use epsilon-greedy-like action selection in addition to sampling from softmax policies in test runs.
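
A minimal sketch of what such action selection could look like, assuming the policy exposes its raw action preferences as a list of logits. The name `sample_action` and the default `epsilon` value are hypothetical and are not taken from the repository; this only illustrates mixing a uniform random action into softmax sampling.

```python
import math
import random


def sample_action(logits, epsilon=0.05, rng=random):
    """Pick an action index for a test run.

    Hypothetical helper: with probability `epsilon` choose a uniformly
    random action, otherwise sample from softmax(logits).
    """
    n_actions = len(logits)
    if rng.random() < epsilon:
        # Exploration branch: every action has the same chance.
        return rng.randrange(n_actions)
    # Numerically stable softmax weights (unnormalized is fine for choices()).
    max_logit = max(logits)
    weights = [math.exp(x - max_logit) for x in logits]
    return rng.choices(range(n_actions), weights=weights, k=1)[0]
```

With a small `epsilon`, a policy that puts almost no probability on the action that launches the ball would still take it occasionally, so an episode should eventually be able to start.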

May 15 '16 03:05 muupan

It didn't occur for Space Invaders. For Breakout we might need to force long episodes to finish.
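
One way to force long episodes to finish is a step cap in the evaluation loop. The following is only a sketch under stated assumptions: `run_capped_episode`, `env_step`, `choose_action`, and `max_steps` are hypothetical names, and the environment is modeled as a plain function returning `(next_state, reward, done)` rather than the repository's actual interface.

```python
def run_capped_episode(env_step, choose_action, initial_state, max_steps=10000):
    """Run one evaluation episode with a hard step limit.

    Hypothetical helper. Returns (total_reward, steps_taken, truncated),
    where `truncated` is True if the cap was hit before the environment
    reported termination.
    """
    state = initial_state
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        state, reward, done = env_step(state, choose_action(state))
        total_reward += reward
        if done:
            # Normal termination reported by the environment.
            return total_reward, step, False
    # The cap was reached: stop so the evaluating process cannot hang.
    return total_reward, max_steps, True


# Usage: an environment that never terminates is cut off at the cap.
stuck_env = lambda state, action: (state, 0.0, False)
result = run_capped_episode(stuck_env, lambda state: 0, None, max_steps=50)
```

Returning the `truncated` flag lets the caller decide whether to log capped episodes separately from ones that ended normally.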

May 17 '16 07:05 muupan