deep_learning_and_the_game_of_go
Chapter 14: ZeroAgent runs out of moves
Running zero_demo.py crashes with the following error:
Traceback (most recent call last):
  File "zero_demo.py", line 104, in <module>
    simulate_game(board_size, black_agent, c1, white_agent, c2)
  File "zero_demo.py", line 44, in simulate_game
    next_move = agents[game.next_player].select_move(game)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 96, in select_move
    next_move = self.select_branch(node)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 142, in select_branch
    return max(node.moves(), key=score_branch) # <1>
ValueError: max() arg is an empty sequence
My guess is the agent runs out of legal moves before it realizes the game is over.
I think I have more or less found the problem: when a round arrives at an existing node where the game is already over, there are no more legal moves (not even passing or resigning). The code nevertheless tries to make a move in order to create a new child node, and it does not handle this situation. I also wonder how the algorithm is supposed to work here. Should the "newly discovered but not really" node be recorded by traversing the tree upwards (this could skew the results), or should this round be ignored (this could lead to selecting the same node over and over again)?
while node.has_child(next_move):
    node = node.get_child(next_move)
    next_move = self.select_branch(node)
The above code does not account for a node with zero legal moves (which is the case when the game is already over at that point), so select_branch ends up calling max() on an empty sequence.
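To make the failure mode concrete, here is a minimal sketch of a tree walk that stops when it reaches a node with no moves instead of calling max() on an empty list. Node, select_branch and walk_to_leaf are hypothetical stand-ins written for this comment, not the classes in dlgo/zero/agent.py, and the scoring rule is a placeholder. What to do with the finished-game node afterwards (for example, backing up the known result) is left open here, as in the question above.

```python
class Node:
    """Hypothetical stand-in for a search-tree node (not the dlgo class)."""

    def __init__(self, legal_moves):
        self.branches = {move: 0 for move in legal_moves}  # move -> visit count
        self.children = {}

    def moves(self):
        return list(self.branches)

    def has_child(self, move):
        return move in self.children

    def get_child(self, move):
        return self.children[move]


def select_branch(node):
    # Placeholder scoring: prefer the least-visited branch.
    return max(node.moves(), key=lambda move: -node.branches[move])


def walk_to_leaf(root):
    """Descend the tree and return (node, next_move).

    next_move is None when the walk ends on a node with no moves,
    i.e. a position where the game is already over.
    """
    node = root
    while node.moves():  # guard: never select from an empty move list
        next_move = select_branch(node)
        if not node.has_child(next_move):
            return node, next_move  # unexplored branch: expand here
        node = node.get_child(next_move)
    return node, None  # finished game: nothing to expand


# Usage: a root whose only child is a finished game.
finished = Node([])
root = Node(["a"])
root.children["a"] = finished
node, move = walk_to_leaf(root)
```

With this guard the caller has to handle move being None explicitly, which is exactly the case the original loop never considers.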
@maxpumperla @macfergus I'm getting the same problem and am trying to fix it. Please review the code in zero_demo.py. Thanks!
@HaodongLi1029 can you please stop spamming various issues here? In the other issue @macfergus advised to check out the chapter_14 branch, how about you start with that?
Hello all, @DrVecctor was correct. The simplest fix is to make pass a legal move even after the game is over. This means the MCTS could continue to read out a branch after the end of the game, which is inefficient, but shouldn't really affect its decisions (once the value network has trained a bit).
Pull in this diff https://github.com/maxpumperla/deep_learning_and_the_game_of_go/commit/6148f57eb98e4c75b102d096401efe780e911442 to get the fix. This diff also makes goboard_fast consistent with goboard.py
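For anyone who wants to see the idea without reading the diff, here is a rough sketch of "pass stays legal after the game is over". ToyGameState, Move and PASS are simplified stand-ins invented for this comment; they are not the dlgo classes and this is not the code from the commit above.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins; not the dlgo Move or GameState.
Move = namedtuple("Move", ["point", "is_pass"])
PASS = Move(point=None, is_pass=True)


class ToyGameState:
    def __init__(self, open_points, game_over=False):
        self.open_points = list(open_points)
        self.game_over = game_over

    def is_over(self):
        return self.game_over

    def legal_moves(self):
        # Pass is always offered, even after the game has ended, so the
        # list handed to the tree search is never empty.
        if self.is_over():
            return [PASS]
        moves = [Move(point=p, is_pass=False) for p in self.open_points]
        moves.append(PASS)
        return moves


# Usage: max() over the legal moves no longer fails on a finished game.
finished = ToyGameState(open_points=[(1, 1)], game_over=True)
best = max(finished.legal_moves(), key=lambda move: 0.0)
```

The trade-off is the one described above: the search may keep reading out passes beyond the end of the game, which costs some rounds but avoids the empty-sequence error.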
After applying the above fix, the error still appears. Should we make it return a pass if there are no more legal moves?
PS: It seems that after adding the pass turn as a legal move, the error disappeared.