PokerRL
Observation space / Infostate
Hey Eric,
thanks for making this public; I haven't found a good env so far that implements multiplayer NL. Am I understanding the code correctly that the observation space isn't actually perfect information, e.g. that it only contains the last couple of actions? Do you have any research on how this affects convergence? I had a bit of trouble understanding the code, so I apologize if I just didn't read it right.
Heyo,
You're welcome! I've written a wrapper that tracks the action history for limit games. For no-limit games, this piece is slightly less obvious, so I would default to the recurrent option that tracks the history of public observations, which is also sound. So perfect recall is still supported for NL through recurrent NNs, but you correctly spotted that the action history is only tracked implicitly through that, not explicitly.
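For intuition, here's a minimal sketch of the recurrent idea: an LSTM consumes the per-step public observations, so its hidden state implicitly encodes the full (perfect-recall) history. This is not PokerRL's actual network; `obs_dim`, `n_actions`, and the layer sizes are placeholders.

```python
# Minimal sketch of the recurrent option (not PokerRL's actual architecture).
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        # The LSTM's hidden state accumulates the history of public observations.
        self.lstm = nn.LSTM(input_size=obs_dim, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, state=None):
        # obs_seq: [batch, seq_len, obs_dim] -- one public observation per step
        out, state = self.lstm(obs_seq, state)
        logits = self.head(out[:, -1])  # act on the latest hidden state
        return logits, state
```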
If you wish to track it explicitly, you could just track it manually in the training code or write an appropriate "Wrapper" for the NL environment (a sketch of what that could look like is below). However, this will require you to adjust the NN's observation space accordingly. TL;DR: recurrent is cleaner and more scalable.
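A minimal sketch of such a wrapper, assuming a generic env with `reset()`/`step(action)` returning flat NumPy observations; `ActionHistoryWrapper`, `MAX_ACTIONS`, and the `(action_type, bet_fraction)` encoding are all hypothetical, not PokerRL's actual API:

```python
# Hypothetical wrapper that tracks the action history explicitly and appends
# a zero-padded, flattened copy of it to each flat observation.
import numpy as np

MAX_ACTIONS = 64   # assumed cap on actions per hand
ACTION_DIM = 2     # assumed encoding: (action_type, bet_size_fraction)

class ActionHistoryWrapper:
    def __init__(self, env):
        self.env = env
        self.history = []

    def reset(self):
        self.history = []
        return self._augment(self.env.reset())

    def step(self, action):
        # A complete encoding would also record the acting seat and street.
        self.history.append(np.asarray(action, dtype=np.float32))
        obs, reward, done, info = self.env.step(action)
        return self._augment(obs), reward, done, info

    def _augment(self, obs):
        # Flatten the history into a fixed-size, zero-padded vector.
        flat = np.zeros(MAX_ACTIONS * ACTION_DIM, dtype=np.float32)
        for i, a in enumerate(self.history[:MAX_ACTIONS]):
            flat[i * ACTION_DIM:(i + 1) * ACTION_DIM] = a
        return np.concatenate([np.asarray(obs, dtype=np.float32), flat])
```

Note that the NN's input layer would then need to grow by `MAX_ACTIONS * ACTION_DIM` to match the augmented observation.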
Cheers, Eric