AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

Results 77 AlphaZero_Gomoku issues

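I'd like to ask how AlphaZero avoids placing a stone on an illegal position, for example one that is already occupied. During the MCTS select step, the probability of each action depends on the policy output, and at the start the policy does not know which positions are legal and which are not. Could this produce illegal actions?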
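A common way to handle the question above is to restrict the policy output to the currently legal moves before it is used as MCTS priors. The sketch below is only an illustration under that assumption: `mask_illegal_moves` and the toy 3x3 board are hypothetical names and data, not code quoted from the repository.

```python
def mask_illegal_moves(policy_probs, legal_moves):
    """Zero out illegal moves and renormalize over the legal ones.

    policy_probs: flat list of probabilities, one per board cell.
    legal_moves: iterable of flat indices that are still empty.
    """
    legal = set(legal_moves)
    masked = [p if i in legal else 0.0 for i, p in enumerate(policy_probs)]
    total = sum(masked)
    if total > 0:
        return [p / total for p in masked]
    # Fallback (an assumption of this sketch): if the network put no mass
    # on any legal move, use a uniform prior over the legal moves.
    return [1.0 / len(legal) if i in legal else 0.0
            for i in range(len(policy_probs))]


# Toy 3x3 board flattened to 9 cells; cells 0 and 4 are already occupied.
raw_policy = [1.0 / 9] * 9
legal_moves = [1, 2, 3, 5, 6, 7, 8]
priors = mask_illegal_moves(raw_policy, legal_moves)
```

With priors built this way, the tree search only ever expands children for legal moves, so an untrained policy cannot select an occupied cell.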

In `game.py`, lines 62-72:

```python
def current_state(self):
    ....
    square_state[0][move_curr // self.width, move_curr % self.height] = 1.0
    ....
```

I assume you may actually mean `[move_curr // self.width, move_curr % self.width]`....
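To see why the choice of modulus matters, here is a small sketch assuming moves are flat row-major indices (`move = row * width + col`). The helper `move_to_location` and the 4-wide, 6-high board are hypothetical and chosen only to make the difference visible; on a square board both expressions give the same result.

```python
def move_to_location(move, width):
    """Decode a flat row-major move index into (row, col)."""
    return divmod(move, width)


# Hypothetical non-square board: 6 rows (height) of 4 columns (width).
width, height = 4, 6
move = 7

row, col = move_to_location(move, width)  # row 1, column 3
wrong_col = move % height                 # 1, which is not the real column
```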

Hi, could you please explain the difference between "best_policy_6_6_4.model" and "best_policy_6_6_4.model2"? (I assume the same explanation applies to the 8_8_5 models.) Thank you.

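Hello author. The data_buffer is not cleared after data is sampled from it, so once training reaches the point where data_buffer is full, the share of new data keeps shrinking. When a mini_batch is drawn at random from data_buffer, new data makes up only about 5% of it, so the neural network's latest feedback cannot show up in the latest training set. Yet the training results still keep improving. What is the reason for this? Thanks.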
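The situation described above can be reproduced with a fixed-size buffer that silently drops its oldest entries. This is a sketch under assumed numbers (`buffer_size`, `batch_size`, and `new_per_iter` are illustrative, not taken from the issue); each stored item is tagged with the iteration that produced it so the share of fresh data can be measured.

```python
import random
from collections import deque

buffer_size = 10000   # assumed buffer capacity
batch_size = 512      # assumed mini-batch size
new_per_iter = 500    # assumed samples added per self-play iteration

# A deque with maxlen evicts the oldest samples once it is full.
data_buffer = deque(maxlen=buffer_size)
iteration = 0
for iteration in range(1, 41):
    data_buffer.extend((iteration, i) for i in range(new_per_iter))

# Share of the buffer that comes from the most recent iteration.
newest = sum(1 for it, _ in data_buffer if it == iteration)
fraction_new_in_buffer = newest / len(data_buffer)

# Uniform sampling: each mini-batch mirrors the buffer's mix on average.
rng = random.Random(0)
mini_batch = rng.sample(list(data_buffer), batch_size)
fraction_new_in_batch = (
    sum(1 for it, _ in mini_batch if it == iteration) / batch_size
)

# The oldest surviving data is still fairly recent: iteration 21 of 40.
oldest_iteration = min(it for it, _ in data_buffer)
```

In this toy setup the buffer is a sliding window over the last 20 iterations, so although only 5% of it is brand new, nothing in it is older than the window.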

https://github.com/junxiaosong/AlphaZero_Gomoku/blob/66292c55cc53acfae7f7bc5a15a370571549bdd9/mcts_alphaZero.py#L206

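Hello, I noticed that the code controls the learning rate by comparing the KL divergence between the outputs of the old and new neural networks. During my experiments the learning rate first increased quickly and then gradually decreased, which suggests the method really works. Is there any literature that describes this method, or did you come up with it from experience?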
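The idea in the question above can be sketched as follows: measure how far the policy moved after an update, then shrink or grow a learning-rate multiplier to keep that movement near a target. The function names, the target `kl_target=0.02`, the factor `1.5`, and the bounds are assumptions made for this illustration, not values quoted in the issue.

```python
import math


def kl_divergence(old_probs, new_probs, eps=1e-10):
    """KL(old || new) between two discrete distributions."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(old_probs, new_probs)
    )


def adjust_lr_multiplier(lr_multiplier, kl, kl_target=0.02,
                         factor=1.5, lowest=0.1, highest=10.0):
    """Return an updated multiplier based on the observed KL."""
    if kl > 2 * kl_target and lr_multiplier > lowest:
        # The update moved the policy too far: slow down.
        return lr_multiplier / factor
    if kl < kl_target / 2 and lr_multiplier < highest:
        # The update barely moved the policy: speed up.
        return lr_multiplier * factor
    return lr_multiplier


old = [0.25, 0.25, 0.25, 0.25]
new = [0.40, 0.30, 0.20, 0.10]
kl = kl_divergence(old, new)
multiplier = adjust_lr_multiplier(1.0, kl)
```

The effective learning rate would then be the base rate times `multiplier`, re-evaluated after each training step.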

Traceback (most recent call last):
  File "human_play.py", line 88, in <module>
    run()
  File "human_play.py", line 81, in run
    game.start_play(human, mcts_player, start_player=0, is_shown=1)
  File "C:\Users\lucky\Desktop\AlphaZero_Gomoku-master\AlphaZero_Gomoku-master\game.py", line 177, in start_play
    move = player_in_turn.get_action(self.board)...

When I unpickle ORENIST.data, downloaded with `git clone https://github.com/enakai00/jupyter_tfbook`, using the command "f=open('../common/ORENIST.data','rb') images,labels=cPickle.load(f)", I get the following error: "ModuleNotFoundError: No module named 'numpy.core.multiarray\r'". I thought that...

`play_data = list(play_data)[:]` vs `play_data = list(play_data)`
python2.7: `list(play_data)[:]` is fine
python3.6: `list(play_data)[:]` is `[]`
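One possible explanation for the difference reported above, assuming `play_data` comes from `zip(...)` (the issue text does not say so): in Python 3 `zip` returns a one-shot iterator, whereas in Python 2 it returns a list. The toy data below is hypothetical.

```python
states = ["s0", "s1"]
probs = [[0.5, 0.5], [0.9, 0.1]]
winners = [1, -1]

# Python 3: zip() yields a one-shot iterator rather than a list.
play_data = zip(states, probs, winners)

first_pass = list(play_data)[:]   # consumes the iterator, holds both items
second_pass = list(play_data)[:]  # the iterator is exhausted, so this is empty
```

Converting to a list once, right where the data is produced, and reusing that list avoids the second, empty pass.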

Do you have a model for the 15x15 board? How long would it take to train?