Confusion about state representation--obs[53] in No-Limit Texas Hold'em

Open ArshartCloud opened this issue 2 years ago • 1 comments

In the obs description, it says:

53 | Chips that all players have put in

However, in line 70 of nolimitholdem.py:

obs[53] = float(max(all_chips))

It uses max rather than sum, which means obs[53] is the highest number of chips any single player has put in the pot, not the total. (If the opponent's chips were lower than yours, the game would end, so it can only be the opponent's chips.) Is that right?
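To make the difference concrete, here is a minimal sketch, not the actual RLCard code. The helper `build_obs`, the 54-element layout, and the use of index 52 for the agent's own chips are assumptions for illustration; only `obs[53] = float(max(all_chips))` comes from the line quoted above.

```python
import numpy as np

def build_obs(card_indices, my_chips, all_chips, use_sum=False):
    """Hypothetical helper: build a 54-element observation vector.

    Indices 0-51 flag visible cards, 52 holds the agent's chips (assumed),
    and 53 holds either max(all_chips) or sum(all_chips).
    """
    obs = np.zeros(54)
    obs[np.asarray(card_indices, dtype=int)] = 1
    obs[52] = float(my_chips)
    # The quoted line uses max; the description suggests sum.
    obs[53] = float(sum(all_chips)) if use_sum else float(max(all_chips))
    return obs

# Two players: the agent has put in 40 chips, the opponent 100.
all_chips = [40, 100]
obs_max = build_obs([0, 25], my_chips=40, all_chips=all_chips)
obs_sum = build_obs([0, 25], my_chips=40, all_chips=all_chips, use_sum=True)
print(obs_max[53], obs_sum[53])
```

With max, the feature only tracks the largest single contribution; with sum, it tracks the pot size.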

ArshartCloud avatar Feb 24 '22 06:02 ArshartCloud

@ArshartCloud You are right. The state features are usually very important in training agents. The wrapper here is just an example, which is not necessarily the best. You can customize the env wrapper to do better feature engineering.

daochenzha avatar Mar 03 '22 19:03 daochenzha

I have a follow-up question about the observation space of limit-texas: it looks like the observation only contains the disclosed cards, without distinguishing private cards from community cards. Isn't this representation a bit problematic when inferring other players' strategies?

cuijiaxun avatar Jan 16 '23 21:01 cuijiaxun

@cuijiaxun Yeah, that is true. The state space of Texas Hold'em is not carefully designed. I expect the agent would be much stronger if the state features were tuned, like what we have done in the DouDizhu game. The state representation of AlphaHoldem could be borrowed here.
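One simple way to separate the two card groups is to give each its own 52-element block. The sketch below is only an illustration under assumptions: the card-string format (`"SA"` for the ace of spades), the index mapping, and the function names are hypothetical, and this is not the AlphaHoldem representation.

```python
import numpy as np

# Assumed card format: suit letter followed by rank letter, e.g. "SA", "HT".
SUITS = "SHDC"
RANKS = "A23456789TJQK"

def card_index(card):
    """Map a card string to an index in 0..51 (hypothetical mapping)."""
    suit, rank = card[0], card[1]
    return SUITS.index(suit) * 13 + RANKS.index(rank)

def encode_cards_separately(hand, public_cards):
    """Return a 104-element vector: private cards first, community cards second."""
    obs = np.zeros(104)
    for card in hand:
        obs[card_index(card)] = 1           # block 0..51: private cards
    for card in public_cards:
        obs[52 + card_index(card)] = 1      # block 52..103: community cards
    return obs

obs = encode_cards_separately(["SA", "HK"], ["D2", "C7", "ST"])
print(int(obs[:52].sum()), int(obs[52:].sum()))
```

Chip features could then be appended after index 103 in a custom env wrapper.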

daochenzha avatar Jan 17 '23 22:01 daochenzha