Why is a negative leaf_value passed to the update_recursive function?

Open KelleyYin opened this issue 6 years ago • 7 comments

https://github.com/junxiaosong/AlphaZero_Gomoku/blob/68603c0d8e5a0ef9273bacc7d281abe27493da1b/mcts_alphaZero.py#L137

I don't fully understand why leaf_value is negated here, which differs from the AlphaGo Zero paper. Could you explain this? Thank you very much.

KelleyYin avatar Apr 20 '18 07:04 KelleyYin

I think the correct call is node.update_recursive(leaf_value)?

xiaoyangzai avatar Apr 21 '18 06:04 xiaoyangzai

Because the parent node and the current node belong to different players, and the value assigned to each node is relative to that node's player.

GeneZC avatar Apr 21 '18 14:04 GeneZC

We use the negative value of the state because alternate levels of the search tree are from the perspectives of different players, and a node's Q-value is in fact used by its parent node in the select stage.

junxiaosong avatar Apr 22 '18 09:04 junxiaosong

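But at the call site, shouldn't the argument be leaf_value rather than -leaf_value? The meaning of the negative sign inside update_recursive is clear, but here it seems that leaf_value is what should be passed in. Could you explain this? I don't quite understand this part. @junxiaosong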

gmftbyGMFTBY avatar Jul 02 '18 04:07 gmftbyGMFTBY

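@gmftbyGMFTBY leaf_value is evaluated from the leaf node's perspective. Once passed in, it is used to update the Q value, and the leaf node's Q value is what its parent node uses when choosing a branch, so that Q value is from the parent node's perspective. The leaf node's own leaf value and its own Q value are therefore seen from opposite perspectives, which is why the value is negated when it is passed in.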

junxiaosong avatar Jul 10 '18 12:07 junxiaosong

Why do the parent node's perspective and the leaf's own perspective need opposite signs?

lvsh2012 avatar Nov 18 '22 17:11 lvsh2012

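Because the parent node and the child node are two opposing players, any gain from one player's move is a loss for the other, so a state that is good for the child looks bad from the parent's point of view. It is a zero-sum game.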

nicehzj avatar Dec 14 '22 11:12 nicehzj
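
The sign convention discussed in this thread can be illustrated with a minimal sketch. The `Node` class below is a hypothetical, simplified stand-in and not the repository's actual class; its attribute names are assumptions. It only assumes what the replies above state: a node's Q value is read by its parent during selection, so it is stored from the perspective of the player who moved into that node, and the sign flips at every level on the way up.

```python
class Node:
    """Hypothetical, simplified search-tree node (not the repository's class)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.n_visits = 0
        # Mean value as seen by the player who moved INTO this node,
        # i.e. the player choosing among branches at the parent.
        self.q = 0.0

    def update(self, value):
        # Incremental running mean of the values backed up through this node.
        self.n_visits += 1
        self.q += (value - self.q) / self.n_visits

    def update_recursive(self, value):
        # The level above belongs to the other player, so flip the sign first.
        if self.parent is not None:
            self.parent.update_recursive(-value)
        self.update(value)


root = Node()              # player A to move
child = Node(parent=root)  # reached by A's move; player B to move
leaf = Node(parent=child)  # reached by B's move; player A to move

# Evaluation from the view of the player to move at the leaf (A): good for A.
leaf_value = 0.8

# Negate at the call site: the leaf's Q is read by its parent, where B chooses.
leaf.update_recursive(-leaf_value)

print(leaf.q, child.q, root.q)  # prints: -0.8 0.8 -0.8
```

In this sketch the leaf ends up with Q = -0.8 (bad for B, who chose the move into it), while its parent ends up with Q = 0.8 (good for A, who chose the move into the parent). Passing `leaf_value` without the minus sign would invert every Q value on the path.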