alpha-zero-general ZeroDivisionError: float division by zero

Hey there,

First of all thank you for sharing this repo with the public. Really cool project!

Regarding the issue: I am currently running into a problem with MCTS in getActionProb, namely ZeroDivisionError: float division by zero. It originates from probs = [x / counts_sum for x in counts]. All the elements in counts are 0, so counts_sum is 0, because no state/action pair for the current state has been discovered and saved in self.Nsa yet (counts = [self.Nsa[(s, a)] if (s, a) in self.Nsa else 0 for a in range(self.game.getActionSize())]).
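
To make the failure mode concrete, here is a minimal sketch of the normalisation step with a guard for the all-zero case. The helper name action_probs and the valid_moves mask are hypothetical and not part of the repo's MCTS class. The uniform fallback only hides the symptom; it does not explain why the state was never visited.

```python
def action_probs(counts, valid_moves):
    """Normalise visit counts into probabilities.

    `counts` holds one visit count per action and `valid_moves` is a 0/1 mask
    of the same length. Both names are hypothetical, for illustration only.
    """
    counts_sum = float(sum(counts))
    if counts_sum == 0:
        # No action was ever visited from this state: dividing by counts_sum
        # would raise ZeroDivisionError, so fall back to a uniform
        # distribution over the valid moves instead.
        n_valid = sum(valid_moves)
        if n_valid == 0:
            raise ValueError("no valid moves available")
        return [v / n_valid for v in valid_moves]
    return [x / counts_sum for x in counts]
```

For example, action_probs([0, 0, 0, 0], [1, 0, 1, 0]) spreads the probability evenly over the two valid moves instead of raising.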

Any ideas what might be causing the issue and how to fix it? I don't see this problem in the other games (e.g. Othello), so I am wondering what the culprit might be here. I am currently trying to make the game of Hex work in the project.

Cheers

Jan 06 '23 00:01 visuallization

#191 might help

Jan 06 '23 07:01 goshawk22

Yes, thanks for pointing this out. However, I had already discovered that issue and couldn't find anything similar in my code. I made sure that I always copy the pieces in the game code. It just seems that MCTS is not exploring edges which the players in the arena then want to play. But I guess it should only play discovered moves (s/a) in the arena since it uses greedy search, right?
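
For reference, greedy selection over visit counts can be sketched roughly as below. The helper greedy_action is hypothetical and not the repo's code. If every count is zero, all actions tie for the maximum, so greedy selection on its own would not guarantee that a discovered move is played.

```python
import random


def greedy_action(counts, rng=random):
    """Return the index of the most-visited action, breaking ties at random.

    Hypothetical helper for illustration; `counts` is a list of visit counts.
    """
    best = max(counts)
    best_actions = [a for a, c in enumerate(counts) if c == best]
    return rng.choice(best_actions)
```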

Jan 06 '23 09:01 visuallization

Could you share your code?

Jan 06 '23 14:01 goshawk22

Could you share your code?

Gladly: https://github.com/visuallization/alpha-zero-general-hex/tree/master/hex The relevant files are HexGame.py and HexBoard.py. It currently uses the NeuralNet architecture from Othello.

Jan 06 '23 16:01 visuallization

I am not working with a pass move though, like Othello does, because there is no pass in Hex.

Jan 06 '23 16:01 visuallization

Okay, there might be an issue with how I represent the canonicalBoard in getCanonicalForm. If I just return the board without inverting it, the problem of not finding the state in self.Nsa no longer arises.

def getCanonicalForm(self, positions, player):
    board = HexBoard(self.size)
    board.positions = np.copy(positions)
    return board.positions
    #return player * board.positions
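
As a toy illustration of why the canonical form could matter here (not taken from the fork, and not a confirmed diagnosis): if visits are stored under a key derived from one board representation and looked up under a key derived from another, every count comes back as 0. The helpers canonical_form and state_key below are hypothetical stand-ins, and plain tuples replace the NumPy arrays to keep the sketch dependency-free.

```python
def canonical_form(positions, player):
    """Return the board from `player`'s perspective by flipping piece signs."""
    return tuple(tuple(player * cell for cell in row) for row in positions)


def state_key(board):
    """Hypothetical stand-in for the game's string representation."""
    return str(board)


positions = ((1, 0), (0, -1))
Nsa = {}  # visit counts keyed by (state key, action)

# The search records a visit using the canonical form for player -1.
s_search = state_key(canonical_form(positions, -1))
Nsa[(s_search, 1)] = 1

# A lookup that uses the same canonical form finds the visit...
s_same = state_key(canonical_form(positions, -1))
counts_same = [Nsa.get((s_same, a), 0) for a in range(4)]

# ...but a lookup keyed on the raw, non-inverted board finds nothing,
# so the counts sum to zero and normalising them would divide by zero.
s_raw = state_key(positions)
counts_raw = [Nsa.get((s_raw, a), 0) for a in range(4)]
```

Here counts_same contains the recorded visit while counts_raw is all zeros, which is the same situation as in the traceback.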

Jan 06 '23 18:01 visuallization

@visuallization did you ever figure this out?

I believe from looking at your fork that this issue was resolved, and is a dupe of https://github.com/suragnair/alpha-zero-general/issues/191

Mar 14 '23 07:03 jamesbraza
