Ege Özsoy comments


Loading LibTorch AlphaZero checkpoints in Python

Do you mean something other than https://github.com/deepmind/open_spiel/blob/master/open_spiel/examples/alpha_zero_torch_game_example.cc?

AlphaZero for Backgammon

I am using the pure Python version (https://github.com/deepmind/open_spiel/tree/master/open_spiel/python/algorithms/alpha_zero). I see some reasons why it does not work out of the box. Firstly, it flat out rejects using non...

AlphaZero for Backgammon

Yeah, sure, I could give it a try, though I would only be fixing the Python version in this case, right?

AlphaZero for Backgammon

I have been wondering whether there is also a significant limitation here: https://github.com/deepmind/open_spiel/blob/f522d174a1e2e8fbaf2007294985869d6e520669/open_spiel/python/algorithms/mcts.py#L311. Legal actions, and therefore child nodes, are computed only once per node, but in a non-deterministic game...
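A toy illustration of the concern raised above, not OpenSpiel code: if a search node builds its child list on the first visit and reuses it afterwards, a later visit that follows a different chance outcome sees stale actions. The `legal_actions` rule, both node classes, and the dice rolls are hypothetical and exist only for this sketch.

```python
def legal_actions(roll):
    """Hypothetical toy rule: the player may advance 1..roll steps."""
    return list(range(1, roll + 1))


class CachedNode:
    """Builds its child list once, on the first visit, and reuses it."""

    def __init__(self):
        self.children = None

    def expand(self, roll):
        if self.children is None:
            self.children = legal_actions(roll)
        return self.children


class PerOutcomeNode:
    """Keeps a separate child list for each chance outcome it has seen."""

    def __init__(self):
        self.children_by_roll = {}

    def expand(self, roll):
        if roll not in self.children_by_roll:
            self.children_by_roll[roll] = legal_actions(roll)
        return self.children_by_roll[roll]


cached, per_outcome = CachedNode(), PerOutcomeNode()
first_visit = cached.expand(roll=2)
second_visit = cached.expand(roll=6)   # reuses the list built for roll=2
fresh_visit = per_outcome.expand(roll=6)

print(second_visit)  # stale: only the actions that were legal for roll=2
print(fresh_visit)   # all actions that are legal for roll=6
```

Keying children by chance outcome is only one possible way around the problem; the thread goes on to discuss a different option inside MCTS itself.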

AlphaZero for Backgammon

> Scratch that. There is a better way. We should add an option to MCTS to never stop expanding at chance nodes (ie. keep expanding the tree if the newly...

AlphaZero for Backgammon

Side question: Should the code simply run on the GPU if TensorFlow with GPU support is installed? Or is this not currently supported?

AlphaZero for Backgammon

I gave it a shot here: https://github.com/deepmind/open_spiel/pull/776. Please let me know

AlphaZero for Backgammon

It is definitely running for me. I just tried it for pig, and I got the following output. Not sure how I should interpret the loss only falling at the beginning...

AlphaZero for Backgammon

I played against it after a while, but it was hard to judge how good it was, as I am not familiar with this game. But I let it play...

AlphaZero for Backgammon

To clarify, I am using the alpha_zero.py example script (https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/alpha_zero.py). But I changed connect_four to pig, max_simulations from 300 to 30, and the network type to an MLP with 2 layers and...
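The changes listed above can be summarised as a set of overrides. This is a plain-dictionary sketch, not the script's actual configuration object: apart from `max_simulations`, which appears in the comment, the key names (`game`, `nn_model`, `nn_depth`) are assumptions for illustration, and any settings cut off in the truncated comment are deliberately left out.

```python
# Baseline values as stated in the comment.
baseline = {
    "game": "connect_four",   # assumed key name
    "max_simulations": 300,
}

# Overrides described in the comment.
overrides = {
    "game": "pig",
    "max_simulations": 30,
    "nn_model": "mlp",        # assumed key name for the network type
    "nn_depth": 2,            # assumed key name for the number of layers
}

# Later keys win, so the overrides replace the baseline values.
config = {**baseline, **overrides}

for key, value in sorted(config.items()):
    print(f"{key} = {value}")
```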