deep_learning_and_the_game_of_go

Chapter 14: ZeroAgent runs out of moves

Open mdm opened this issue 4 years ago • 5 comments

Running zero_demo.py crashes with the following error:

Traceback (most recent call last):
  File "zero_demo.py", line 104, in <module>
    simulate_game(board_size, black_agent, c1, white_agent, c2)
  File "zero_demo.py", line 44, in simulate_game
    next_move = agents[game.next_player].select_move(game)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 96, in select_move
    next_move = self.select_branch(node)
  File "/deep_learning_and_the_game_of_go/code/dlgo/zero/agent.py", line 142, in select_branch
    return max(node.moves(), key=score_branch)             # <1>
ValueError: max() arg is an empty sequence

My guess is the agent runs out of legal moves before it realizes the game is over.

mdm avatar Apr 13 '20 12:04 mdm

I think I have more or less found the problem. When a round arrives at an existing node where the game is already over, there are no more legal moves (not even passing or resigning), but the code still attempts to make a move in order to create a new child node, and it does not handle this situation. I also wonder how the algorithm is supposed to work here: should the "newly discovered but not really" node be recorded by traversing the tree upwards (this could skew the results), or should this round be ignored (this could lead to selecting the same node over and over again)?

    while node.has_child(next_move):
        node = node.get_child(next_move)
        next_move = self.select_branch(node)

The above code does not account for the possibility of zero legal moves (which is the case when the game is already over at that point).
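As a rough illustration of the loop quoted above, here is a minimal, self-contained sketch of one way to guard it. `ToyNode` and `walk_to_leaf` are hypothetical stand-ins, not the classes in dlgo/zero/agent.py, and the scoring is a placeholder. The only point is that `max(..., default=None)` lets the walk stop at a node with no legal moves instead of raising `ValueError`.

```python
# Toy stand-ins for the tree classes; all names are hypothetical and only
# mirror the calls quoted in this thread (moves, has_child, get_child).
class ToyNode:
    def __init__(self, legal_moves):
        self.legal_moves = list(legal_moves)  # empty when the game is over
        self.children = {}

    def moves(self):
        return self.legal_moves

    def has_child(self, move):
        return move in self.children

    def get_child(self, move):
        return self.children[move]


def select_branch(node):
    # Placeholder scoring: moves are simply compared by name here.
    # default=None avoids "max() arg is an empty sequence".
    return max(node.moves(), default=None)


def walk_to_leaf(root):
    """Follow existing children until a move has no child yet,
    or until a node with no legal moves is reached."""
    node = root
    next_move = select_branch(node)
    while next_move is not None and node.has_child(next_move):
        node = node.get_child(next_move)
        next_move = select_branch(node)
    return node, next_move


terminal = ToyNode([])          # game already over: no legal moves
root = ToyNode(["a", "b"])
root.children["b"] = terminal   # "b" was expanded in an earlier round
leaf, move = walk_to_leaf(root)
print(leaf is terminal, move)   # True None
```

What to do with such a terminal leaf is a separate design choice. One option, assumed here rather than taken from the repository, would be to back up the actual game result from that node instead of trying to expand a new child.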

DrVecctor avatar Jul 03 '20 17:07 DrVecctor

@maxpumperla @macfergus I'm hitting the same problem and am trying to fix it. Please review the code in zero_demo.py. Thanks!

huynq55 avatar Dec 18 '20 08:12 huynq55

@HaodongLi1029 can you please stop spamming various issues here? In the other issue @macfergus advised to check out the chapter_14 branch, how about you start with that?

maxpumperla avatar Dec 22 '20 09:12 maxpumperla

Hello all, @DrVecctor was correct. The simplest fix is to make pass a legal move even after the game is over. This means the MCTS could continue to read out a branch past the end of the game, which is inefficient, but shouldn't really affect its decisions (once the value network has trained a bit).

Pull in this diff https://github.com/maxpumperla/deep_learning_and_the_game_of_go/commit/6148f57eb98e4c75b102d096401efe780e911442 to get the fix. This diff also makes goboard_fast consistent with goboard.py
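To make the idea concrete without reproducing the linked commit, here is a small sketch of "pass stays legal after the game is over". `ToyGameState`, `Move`, and `PASS` are hypothetical names for illustration and are not the dlgo API; the actual change in the diff may look different.

```python
from collections import namedtuple

# Hypothetical move type: kind is "play" or "pass".
Move = namedtuple("Move", ["kind", "point"])
PASS = Move("pass", None)


class ToyGameState:
    def __init__(self, open_points, over=False):
        self.open_points = list(open_points)
        self.over = over

    def is_over(self):
        return self.over

    def legal_moves(self):
        # Assumption for this sketch: once the game is over, passing is
        # the only legal move, so the list is never empty.
        if self.is_over():
            return [PASS]
        moves = [Move("play", p) for p in self.open_points]
        moves.append(PASS)
        return moves


finished = ToyGameState([], over=True)
# max() now always has at least one candidate to score.
best = max(finished.legal_moves(), key=lambda m: 0.0)
print(best.kind)  # pass
```

With a state like this, a tree search that calls `max()` over the legal moves always has something to choose from, at the cost of possibly reading out extra passes after the game has ended.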

macfergus avatar Dec 22 '20 16:12 macfergus

After applying the above fix, the error still appears. Should we make it return a pass if there are no more legal moves?

PS: It seems that after adding the pass turn to the legal moves, the error disappeared.

arisliang avatar Dec 27 '21 07:12 arisliang