rlcard Confusion about state representation--obs[53] in No-Limit Texas Hold'em

Open ArshartCloud opened this issue 2 years ago • 1 comments

In the obs description, it says:

53 | Chips that all players have put in

obs[53] = float(max(all_chips))

It uses max rather than sum, which means obs[53] is the highest number of chips that any single player has put in the pot. (If the opponent's chips were lower than yours, the game would have ended, so it can only be the opponent's chips.) Is that right?

Feb 24 '22 06:02 ArshartCloud

@ArshartCloud You are right. The state features are usually very important in training agents. The wrapper here is just an example, which is not necessarily the best. You can customize the env wrapper to do better feature engineering.

Mar 03 '22 19:03 daochenzha
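To make the max-versus-sum distinction concrete, here is a minimal sketch of alternative chip features that a customized wrapper could compute. The helper `chip_features` and the `to_call` feature are hypothetical illustrations, not part of rlcard; only `float(max(all_chips))` comes from the wrapper quoted above.

```python
def chip_features(my_chips, all_chips):
    """Hypothetical helper (not part of rlcard): alternative chip features."""
    highest = float(max(all_chips))      # what obs[53] stores in the quoted wrapper
    pot = float(sum(all_chips))          # total chips put in by all players
    to_call = highest - float(my_chips)  # chips needed to match the highest bet
    return {"highest": highest, "pot": pot, "to_call": to_call}


# Example: we have put in 2 chips, the opponent has put in 6.
features = chip_features(my_chips=2, all_chips=[2, 6])
print(features)
```

Whether the pot total or the amount to call actually helps training is an empirical question; the sketch only shows how the features differ.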

I have a follow-up question about the observation space of limit-texas: Looks like the observation only contains disclosed cards, instead of distinguishing private cards and community cards. Isn't this representation a bit problematic when inferring other players' strategies?

Jan 16 '23 21:01 cuijiaxun

@cuijiaxun Yeah, that is true. The state space of Texas Hold'em is not carefully designed. I expect the agent would be much stronger if the state features were tuned, as we did for the DouDizhu game. The state representation of AlphaHoldem could be borrowed here.

Jan 17 '23 22:01 daochenzha
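One way to distinguish private cards from community cards is to give each group its own 52-slot one-hot block. The sketch below is only an illustration: the card string format (suit letter followed by rank letter), the index order, and the helper names `card_index` and `encode_cards` are all assumptions, not rlcard's actual encoding or AlphaHoldem's representation.

```python
# Assumed card format: suit letter followed by rank letter, e.g. "SA", "HK".
SUITS = "SHDC"
RANKS = "A23456789TJQK"


def card_index(card):
    """Map a card string to an index in [0, 52) (assumed ordering)."""
    suit, rank = card[0], card[1]
    return SUITS.index(suit) * 13 + RANKS.index(rank)


def encode_cards(hand, public_cards):
    """Hypothetical encoder: 52 slots for private cards, then 52 for community cards."""
    private = [0.0] * 52
    community = [0.0] * 52
    for card in hand:
        private[card_index(card)] = 1.0
    for card in public_cards:
        community[card_index(card)] = 1.0
    return private + community


obs = encode_cards(hand=["SA", "HK"], public_cards=["D2", "C7", "ST"])
print(len(obs), sum(obs[:52]), sum(obs[52:]))
```

With this layout the agent can tell which cards only it can see, at the cost of doubling the card portion of the observation.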
