rlcard
Confusion about state representation--obs[53] in No-Limit Texas Hold'em
The obs description says:
53 | Chips that all players have put in
However, at line 70 of nolimitholdem.py:
obs[53] = float(max(all_chips))
it uses max rather than sum, so obs[53] is the largest amount any single player has put in the pot, not the total. (If the opponent's chips were lower than yours, the game would end, so in practice this value can only be the opponent's chips.) Is that right?
@ArshartCloud You are right. The state features are usually very important in training agents. The wrapper here is just an example, which is not necessarily the best. You can customize the env wrapper to do better feature engineering.
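To make the difference concrete, here is a minimal standalone sketch of a 54-dimensional observation in which the last entry can be either the max or the sum of the chips put in. The function name `build_obs` and the `pot_feature` parameter are hypothetical and are not part of rlcard; only the layout (52 card slots, then obs[52] and obs[53]) follows the description quoted above.

```python
def build_obs(card_indices, my_chips, all_chips, pot_feature="max"):
    """Hypothetical helper (not rlcard API): build a 54-dim observation.

    obs[0:52] -> 1.0 for every visible card index
    obs[52]   -> chips the current player has put in
    obs[53]   -> max or sum of chips over all players, per pot_feature
    """
    obs = [0.0] * 54
    for i in card_indices:
        obs[i] = 1.0
    obs[52] = float(my_chips)
    if pot_feature == "max":
        # Behaviour reported in this thread: largest single contribution.
        obs[53] = float(max(all_chips))
    elif pot_feature == "sum":
        # What the description suggests: total chips from all players.
        obs[53] = float(sum(all_chips))
    else:
        raise ValueError("pot_feature must be 'max' or 'sum'")
    return obs


# Two players: we have put in 4 chips, the opponent 10.
obs_max = build_obs([0, 5], my_chips=4, all_chips=[4, 10], pot_feature="max")
obs_sum = build_obs([0, 5], my_chips=4, all_chips=[4, 10], pot_feature="sum")
print(obs_max[53], obs_sum[53])
```

In a two-player game the two variants carry the same information once obs[52] is known, since the sum is just obs[52] plus the opponent's chips; the choice matters more with three or more players.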
I have a follow-up question about the observation space of limit-texas: it looks like the observation only encodes which cards are visible to the player, without distinguishing private cards from community cards. Isn't this representation a bit problematic when inferring other players' strategies?
@cuijiaxun Yeah, that is true. The state space of Texas Hold'em is not carefully designed. I expect the agent would be much stronger with tuned state features, as we did for the DouDizhu game. The state representation of AlphaHoldem could be borrowed here.
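As one simple illustration of such feature engineering, the sketch below encodes private and community cards in two separate 52-slot blocks instead of one shared block. This is not AlphaHoldem's representation and not rlcard code; the card string format (suit letter followed by rank letter), the index order in `CARD_TO_INDEX`, and the function name `encode_cards_split` are all assumptions made for the example.

```python
SUITS = "SHDC"
RANKS = "A23456789TJQK"

# Assumed mapping: 'SA' -> 0, 'S2' -> 1, ..., 'CK' -> 51.
CARD_TO_INDEX = {
    s + r: si * 13 + ri
    for si, s in enumerate(SUITS)
    for ri, r in enumerate(RANKS)
}


def encode_cards_split(hand, public_cards):
    """Hypothetical encoder: return a 104-dim list.

    obs[0:52]   -> one-hot marks for the player's private cards
    obs[52:104] -> one-hot marks for the community cards
    """
    obs = [0.0] * 104
    for card in hand:
        obs[CARD_TO_INDEX[card]] = 1.0
    for card in public_cards:
        obs[52 + CARD_TO_INDEX[card]] = 1.0
    return obs


# Private cards: ace of spades, king of hearts. Flop: 2, 3, 4 of diamonds.
obs = encode_cards_split(["SA", "HK"], ["D2", "D3", "D4"])
print(len(obs), int(sum(obs)))
```

With separate blocks the network no longer has to guess which of the visible cards are shared with the opponent, at the cost of a larger input vector.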