Some evaluation results are missing
In scores.txt of the currently uploaded trained model, the evaluation results at 55000000 and 56000000 are missing.
https://github.com/muupan/async-rl/blob/0ec501c36b6cdbe3888d3a6fc0043e4bc6c2cba3/trained_model/breakout/scores.txt#L55
I don't know why this happened or whether it affects performance. I need to check.
I found that the missing evaluations are caused by processes getting stuck in evaluate_performance(). It is possible that some policies fail to start playing Breakout, which prevents episodes from terminating. If so, it might be necessary to use epsilon-greedy-like action selection in addition to sampling from softmax policies in test runs.
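A rough sketch of what the epsilon-greedy-like selection mentioned above could look like. This is not the code in this repository: `sample_action`, its parameters, and the default `epsilon` value are hypothetical names and values chosen for illustration, and `probs` is assumed to be the softmax output for one state as a plain list of floats.

```python
import random


def sample_action(probs, epsilon=0.05, rng=random):
    """Return an action index for evaluation.

    With probability `epsilon` a uniformly random action is returned;
    otherwise the action is sampled from the softmax probabilities.
    All names and the default epsilon are illustrative assumptions.
    """
    n_actions = len(probs)
    if rng.random() < epsilon:
        # Uniform exploration: gives every action a nonzero chance,
        # even one the policy assigns (almost) zero probability.
        return rng.randrange(n_actions)
    # Ordinary sampling from the softmax policy.
    return rng.choices(range(n_actions), weights=probs, k=1)[0]
```

The idea is that even a policy that puts nearly all of its mass on actions that never launch the ball would still occasionally take a different action, so an evaluation episode has a chance to begin and eventually end.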
It didn't occur for Space Invaders. For Breakout we might need to force long episodes to finish.
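One way to force long episodes to finish would be a step cap in the evaluation loop. The sketch below is only an assumption about how that might look: `run_capped_episode`, `max_steps`, and the `reset()` / `step(action) -> (obs, reward, done)` environment interface are hypothetical and do not match the actual code in evaluate_performance().

```python
def run_capped_episode(env, choose_action, max_steps=10000):
    """Run one evaluation episode, giving up after `max_steps` steps.

    Returns (total_reward, steps_taken, truncated). `env` is assumed to
    expose reset() -> obs and step(action) -> (obs, reward, done); this
    interface is a hypothetical stand-in for illustration.
    """
    obs = env.reset()
    total_reward = 0.0
    for step in range(max_steps):
        obs, reward, done = env.step(choose_action(obs))
        total_reward += reward
        if done:
            return total_reward, step + 1, False
    # The episode never terminated on its own, so cut it off here
    # instead of letting the evaluation process hang.
    return total_reward, max_steps, True


class NeverEndingEnv:
    """Toy environment that never reports done, mimicking a stuck game."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, 0.0, False
```

With the toy environment, `run_capped_episode(NeverEndingEnv(), lambda obs: 0, max_steps=50)` returns after 50 steps with the truncated flag set, rather than looping forever. Whether truncated episodes should count toward the reported score is a separate decision.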