rlcard

Probably a bug in mahjong/game.py step_back()

Open 196693 opened this issue 4 years ago • 2 comments

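In Mahjong's game.py there are two objects, round and dealer, and round itself holds a dealer. init_game first instantiates a dealer and then passes the reference to round. However, once step_back is enabled, step uses deepcopy to save the previous state, and step_back only does a pop, so at that point game.dealer and game.round.dealer no longer match. As a result, errors occur when using cfrAgent.

For now, I made a simple change in step_back: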

self.dealer, self.players, self.round = self.history.pop()
self.round.dealer = self.dealer
self.round.judger = self.judger

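I am not sure whether the judger line is necessary. But as another issue points out, there is some redundancy in the Mahjong code; once that is cleaned up, this workaround will no longer be needed.

Plus: this CFR is really slow.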
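To make the reported mismatch concrete, here is a minimal sketch using toy stand-ins rather than the actual rlcard classes. It assumes that step saves the dealer and the round with separate deepcopy calls; the Dealer, Round, and Game classes, the relink flag, and the helper function are all hypothetical and exist only for illustration.

```python
import copy


class Dealer:
    def __init__(self):
        self.deck = ["1-bamboo", "2-bamboo", "3-bamboo"]


class Round:
    def __init__(self, dealer):
        # Shared reference, mirroring what init_game is described as doing
        self.dealer = dealer


class Game:
    def __init__(self):
        self.dealer = Dealer()
        self.round = Round(self.dealer)
        self.history = []

    def step(self):
        # Assumption: each object is deep-copied on its own, so the copy of
        # the round carries its own private copy of the dealer
        self.history.append(
            (copy.deepcopy(self.dealer), copy.deepcopy(self.round))
        )
        self.dealer.deck.pop()

    def step_back(self, relink):
        self.dealer, self.round = self.history.pop()
        if relink:
            # The workaround from this issue: point the round back at the
            # restored dealer
            self.round.dealer = self.dealer


def shares_dealer_after_step_back(relink):
    """Return True if game.round.dealer and game.dealer are one object."""
    game = Game()
    game.step()
    game.step_back(relink)
    return game.round.dealer is game.dealer


print(shares_dealer_after_step_back(relink=False))
print(shares_dealer_after_step_back(relink=True))
```

Under that assumption, the first call reports False (two distinct dealer objects after the pop) and the second reports True once the reference is re-linked.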

196693 avatar Dec 10 '20 08:12 196693

@daochenzha Just following up on this

jkterry1 avatar Feb 25 '21 21:02 jkterry1

I think that in the step_back function, the single line self.dealer, self.players, self.round = self.history.pop() is enough, and the lines

self.round.dealer = self.dealer
self.round.judger = self.judger

are unnecessary, because the pop also restores the previous state of self.round, which contains the previous state of self.round.dealer, and that should already match the previous state of self.dealer. Each step updates self.round.dealer and self.dealer together, so they match, and therefore the self.round.dealer inside the deepcopy of self.round also matches the deepcopy of self.dealer. Consequently, pop returns those previous states, and there is no mismatch between self.dealer and self.round.dealer.
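One nuance that may help when weighing the two views is the difference between matching state and being the same object. The sketch below uses toy classes, not the rlcard ones, and assumes the snapshot is taken either with separate deepcopy calls or with a single deepcopy over a tuple; the names Dealer, Round, separate, and joint are hypothetical.

```python
import copy


class Dealer:
    def __init__(self):
        self.deck = [1, 2, 3]


class Round:
    def __init__(self, dealer):
        self.dealer = dealer


dealer = Dealer()
round_ = Round(dealer)

# Separate calls: each deepcopy has its own memo, so the round's copy
# holds a second, independent dealer copy.
separate = (copy.deepcopy(dealer), copy.deepcopy(round_))

# One call over the tuple: a single memo is shared, so the dealer is
# copied once and the copied round points at that same copy.
joint = copy.deepcopy((dealer, round_))

states_match = separate[0].deck == separate[1].dealer.deck
separate_linked = separate[1].dealer is separate[0]
joint_linked = joint[1].dealer is joint[0]

# Mutating one of the separately copied dealers does not affect the other.
separate[0].deck.pop()
diverged = separate[0].deck != separate[1].dealer.deck

print(states_match, separate_linked, joint_linked, diverged)
```

In this toy setup the separately copied objects start with equal state but are not linked, so they drift apart after the next mutation, while the single deepcopy over the tuple keeps the link. Whether this applies to mahjong/game.py depends on how step actually builds the history entry.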

kaanozdogru avatar Apr 12 '21 00:04 kaanozdogru