
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

AlphaZero_Gomoku issues (77)

When importing my own 6_6_4 model trained with PyTorch, I hit the error below. How can I fix it? I suspect I changed some part of the code incorrectly.

    root@autodl-container-464d11b752-0064c9d8:~/autodl-fs/gomoku/AlphaZero_Gomoku# python human_play.py
    Traceback (most recent call last):
      File "human_play.py", line 88, in <module>
        run()
      File "human_play.py", line 60, in run
        best_policy = PolicyValueNet(width, height, model_file)
      File "/root/autodl-fs/gomoku/AlphaZero_Gomoku/policy_value_net_pytorch.py", line 78, ...
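Not the original poster's fix, but a minimal sketch of the usual cause: the `width`, `height`, and `n_in_row` passed in human_play.py must match what the 6_6_4 model was trained with (6x6 board, four in a row). The model filename below is hypothetical.

```python
# Minimal sketch for human_play.py, assuming the model was trained
# on a 6x6 board with n_in_row=4 (the "6_6_4" settings).
from game import Board, Game
from mcts_alphaZero import MCTSPlayer
from policy_value_net_pytorch import PolicyValueNet

n = 4                    # must match n_in_row used during training
width, height = 6, 6     # must match the training board size
model_file = 'best_policy_6_6_4.model'  # hypothetical filename

board = Board(width=width, height=height, n_in_row=n)
best_policy = PolicyValueNet(width, height, model_file)
mcts_player = MCTSPlayer(best_policy.policy_value_fn,
                         c_puct=5, n_playout=400)
```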

Hi. I am a student working on research regarding decision-making and prediction during a game. Due to the experimental setting, we'd like to change the board to an asymmetrical...
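A non-authoritative sketch: `Board` in game.py accepts separate width and height, so a rectangular (asymmetrical) board is mostly a matter of passing different values wherever the two appear (train.py, human_play.py, the policy-value net constructor). One caveat: the rotation-based data augmentation in train.py assumes a square board, so it would need to be reduced to flips only. The 9x6 size below is an illustrative assumption.

```python
# Sketch: a rectangular board, assuming game.py's Board signature.
from game import Board

board = Board(width=9, height=6, n_in_row=5)  # illustrative 9x6 board
# The network must be built with the same rectangle, e.g.
# PolicyValueNet(board_width=9, board_height=6).
# Note: np.rot90-based augmentation in train.py only works for squares.
```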

It seems the Keras implementation does not use a policy gradient algorithm (reinforcement learning); instead, it uses supervised learning, while the PyTorch implementation uses reinforcement learning. Why is that?
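For reference, a sketch of the AlphaZero-style loss the PyTorch version optimizes in its train_step: a value MSE plus a policy cross-entropy against the MCTS visit-count distribution. Both readings fit: the targets are self-generated through self-play (reinforcement learning in the outer loop), but each gradient step looks like supervised learning.

```python
# Sketch of the combined AlphaZero loss (z - v)^2 - pi^T log p,
# mirroring policy_value_net_pytorch.py's train_step.
import torch
import torch.nn.functional as F

def alphazero_loss(log_act_probs, value, mcts_probs, winner_z):
    # value head: regress v toward the self-play outcome z
    value_loss = F.mse_loss(value.view(-1), winner_z)
    # policy head: cross-entropy against MCTS visit-count probabilities pi
    policy_loss = -torch.mean(torch.sum(mcts_probs * log_act_probs, dim=1))
    return value_loss + policy_loss
```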

Hello, and thank you very much for this excellent AlphaGo Zero implementation, and for taking time out of your busy schedule to look at my question: when I train with PyTorch, the loss never goes down; it hovers around 6-7 the whole time. Have you run into anything similar, and is there a known fix? (I also hit the explain_val=0 issue you mentioned in an earlier answer, but in most cases it does have a value.) From your past replies, it seems you only fully trained the network on Theano. Using that network, I got a loss curve similar to yours, so I suspect the difference between PyTorch and Theano is what keeps the PyTorch training from converging. Converting the Theano-trained weights into a PyTorch weight file and using those still gave unsatisfactory results, so could the two already differ in the forward pass? I am currently on pytorch==1.12.0 and have tried both CPU and GPU; I also tried pytorch==0.4.1, but that did not solve it either. Tuning the hyperparameters does not seem to help much, either.
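One quick diagnostic (not a fix from the author): train.py prints the explained variance of the value head, which separates "the value head learned nothing" from "the loss plateaued for other reasons". A sketch of that metric:

```python
# Sketch of the explained-variance check train.py prints each update:
# close to 1 means the value head predicts game outcomes well, near 0
# means it learned nothing, negative means worse than a constant guess.
import numpy as np

def explained_var(winner_z, value_pred):
    winner_z = np.asarray(winner_z, dtype=np.float64)
    value_pred = np.asarray(value_pred, dtype=np.float64).flatten()
    return 1.0 - np.var(winner_z - value_pred) / np.var(winner_z)
```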

How do I save the model at a fixed number of training iterations, and how can I view an Elo rating for it?
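Not an official feature, but a sketch of one common approach: train.py already calls save_model() every check_freq batches inside TrainPipeline.run(); tagging the filename with the batch number (the pattern below is an assumption) keeps a snapshot per interval, and those snapshots can later be played against each other to estimate Elo.

```python
# Sketch: inside TrainPipeline.run() in train.py, save a numbered
# snapshot every check_freq batches (filename pattern is an assumption).
if (i + 1) % self.check_freq == 0:
    self.policy_value_net.save_model(
        './current_policy_batch_{}.model'.format(i + 1))
```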

As the title says: if I want to train on square boards of different sizes, which parts of the code should I modify?
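A non-authoritative sketch of the usual answer: the board size lives in TrainPipeline.__init__ in train.py (and must be mirrored in human_play.py when playing against the result); the values below are illustrative, and a new size means retraining from scratch.

```python
# Sketch: in train.py, TrainPipeline.__init__ holds the board settings;
# changing these (and retraining from scratch) changes the board size.
self.board_width = 11    # illustrative new size
self.board_height = 11
self.n_in_row = 5
```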