catanatron
Implement Card Counting
Keeping track of your enemies' cards is usually helpful. Implementing this could:
- Make Catanatron stronger.
- Allow this project to be part of https://github.com/google-deepmind/open_spiel (for that we would have to be able to explicitly create "Chance Nodes" in the tree).
- Fix the fact that right now the AlphaBetaPlayer, when searching a tree, might imagine impossible "stealing card" outcomes (see: https://github.com/bcollazo/catanatron/blob/master/catanatron_experimental/catanatron_experimental/machine_learning/players/tree_search_utils.py#L107). With perfect card counting this might not be needed, since the player could keep an exact distribution of the chances of stealing each card from any given enemy.
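To make the last point concrete, here is a minimal sketch of the steal distribution a player could compute once it knows an enemy's hand exactly. It assumes the stolen card is drawn uniformly at random from the hand; `steal_distribution` and the resource names are illustrative, not part of catanatron's API.

```python
from collections import Counter
from fractions import Fraction


def steal_distribution(hand: Counter) -> dict:
    """Chance of stealing each resource from a fully known hand.

    Assumes the stolen card is picked uniformly at random.
    """
    total = sum(hand.values())
    if total == 0:
        return {}  # nothing to steal, so no outcomes
    return {res: Fraction(n, total) for res, n in hand.items() if n > 0}


# Hypothetical enemy hand; each entry of `dist` would be one chance-node outcome.
enemy_hand = Counter({"WOOD": 2, "BRICK": 1, "WHEAT": 1})
dist = steal_distribution(enemy_hand)
```

Only resources that are actually in the hand show up as outcomes, so a tree search built on this would never expand an impossible steal.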
It would be cool to have this "card counting capability" be parametrized by a number `K` that represents how many of its enemies' cards a player can keep "in its mind". I myself, as a player, can probably just keep track of the last 2-3 cards in my enemies' hands 😅.
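One possible reading of `K`, sketched below: the player remembers only the `K` most recently observed cards of a given enemy, and older observations fall out of memory. `BoundedMemory` and its method names are hypothetical; this is just one way to interpret the parameter.

```python
from collections import Counter, deque


class BoundedMemory:
    """Remembers at most the k most recently observed cards of one enemy."""

    def __init__(self, k: int):
        # A deque with maxlen silently drops the oldest entry when full.
        self.cards = deque(maxlen=k)

    def saw_gain(self, resource: str) -> None:
        self.cards.append(resource)

    def saw_spend(self, resource: str) -> None:
        # Forget one remembered copy, if we still remember any.
        try:
            self.cards.remove(resource)
        except ValueError:
            pass  # the card was already forgotten (or never seen)

    def known(self) -> Counter:
        return Counter(self.cards)


memory = BoundedMemory(k=2)
for card in ["WOOD", "BRICK", "ORE"]:
    memory.saw_gain(card)  # WOOD is forgotten once ORE arrives
memory.saw_spend("BRICK")
known = memory.known()
```

With a very large `k` this degrades gracefully into remembering everything that was publicly observable.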
There is a question of whether to implement this as part of the Game or as a capability of the Players. Intuitively, it sounds like it should be the Player (since some may have the capability and others not), but I am not sure what the API would look like. Do we now have to feed every "tick" to every player? Or do we make it so that the player has a pointer to the log of ticks/actions and knows how many it has "consumed", so that the next time it is its turn, it "updates" its card counting distribution/belief? With the latter approach, we would have to make sure reading the log doesn't give the player more information than it should have.
This sounds like it could be developed as a fairly independent module (say `counter = CardCounter(k=5)` and `counter.update(...)`) that players may or may not use in their API.
When playerA steals from playerB, playerB should be the only other player aware of that card in playerA's hand. playerC would know the size of playerA's hand, but would have to leave an unknown card in its representation of playerA's hand. Giving AlphaBetaPlayer perfect knowledge would be an unfair advantage, so maybe each player needs their own card counting data to be updated. I'm just ramping up, but it sounds tricky to give this ability to the players, since they would need to listen in on each action and avoid updating their knowledge in many different cases.
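A rough sketch of that asymmetry, assuming each player keeps its own beliefs as a dict from color to a `Counter` of cards. `observe_steal` and the `UNKNOWN` placeholder are made up for illustration. For bystanders it takes a deliberately lossy shortcut: the victim's whole hand collapses to unknown cards, which throws away information but never records something false. It also assumes the beliefs are consistent with the real hand sizes.

```python
from collections import Counter

UNKNOWN = "UNKNOWN"  # placeholder for a card the observer could not see


def observe_steal(beliefs, observer, thief, victim, card):
    """Update the beliefs held by `observer` after `thief` robs `victim`."""
    if observer in (thief, victim):
        # Both parties to the steal see the actual card.
        source = card if beliefs[victim][card] > 0 else UNKNOWN
        beliefs[victim][source] -= 1
        beliefs[thief][card] += 1
    else:
        # Bystanders only learn that one card moved between the hands.
        remaining = sum(beliefs[victim].values()) - 1
        beliefs[victim] = Counter({UNKNOWN: remaining})
        beliefs[thief][UNKNOWN] += 1
    return beliefs


def fresh_beliefs():
    return {"RED": Counter({"WOOD": 1}), "BLUE": Counter({"ORE": 1, "WHEAT": 1})}


# RED steals ORE from BLUE; WHITE is a bystander, BLUE is the victim.
white_view = observe_steal(fresh_beliefs(), "WHITE", "RED", "BLUE", "ORE")
blue_view = observe_steal(fresh_beliefs(), "BLUE", "RED", "BLUE", "ORE")
```

Here `blue_view` records that RED now holds an ORE, while `white_view` only gains one `UNKNOWN` card in RED's hand.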
Also, what if the UI tracked the known and unknown cards in the other players' hands on behalf of the player? That would be neat.
Shouldn't it be possible to moderate this through the `Game` class? Each player has a list of cards it remembers, and whenever a steal happens the game object updates the lists accordingly.
@zarns I agree that perfect knowledge would be an unfair advantage. We shouldn't do that. I am also OK if, however we implement this, it is done on an "honor system" (no real language/framework checks). Such checks usually lower the speed at which games run, and our goal here is to research/find the best Catan player, so just being disciplined (since we can all review the open-source code) is good enough for me. If there's a nice way to add checks that make it impossible for players to know more than they should, without noticeably impacting speed, then even better; I'll take it.
@HassanJbara it might. But I'm not sure we want to add the responsibility to the `Game` class, since maintaining the state for all players would slow all games (even between players that don't count cards). If you see another way, let me know!
For now, the best I see is a `CardCountingModule` that can be used like this:
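```python
from catanatron.models.player import Player


class MyPlayer(Player):
    def __init__(self, color):
        super().__init__(color)  # sets self.color before we use it below
        # CardCountingModule is the proposed module; it does not exist yet.
        self.counter = CardCountingModule(self.color, k=10)
        self.last_action_seen = 0  # or so...

    def decide(self, game, playable_actions):
        # Feed the counter only the actions logged since our last decision.
        actions_not_consumed = game.state.actions[self.last_action_seen:]
        self.last_action_seen = len(game.state.actions)
        self.counter.update(actions_not_consumed)

        # then the user can consult the counter with say
        believed_enemy_hand = self.counter.believed_hand(enemy_color)
```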
Indexing math might be off.. but you get the point! 👍
I can look into this after finishing the other issue; I don't see why it shouldn't be doable. One thing I don't quite understand, though, is the `k` parameter. Is the idea here that a player that remembers all cards drawn is unfair, even though that's technically not cheating? Or am I getting this wrong?
This is a LOT of the post-second-move gameplay. You pay attention to public actions, i.e. draws/plays/trades (with someone's K as a limit, as above). You could model what your player "believes the other player has" using K and a confidence score that takes into account the public plays made since the last unknowns (steals) or the like, BUT are you trying to model human behavior or build a better bot? Give the bots ALL the observable actions.