Question about score counting of blackjack
Hi, I found a possible problem with score counting in the blackjack environment.
https://github.com/datamllab/rlcard/blob/f256657dd13039bd707e6d95ea3c31795b573c76/rlcard/envs/blackjack.py#L38-L71
Consider the case where there is more than one 'Ace' in my_cards (i.e. the player's hand) and the score is > 21: get_scores_and_A only deducts 10 from the score once. For example, the following is the state in one such case.
{'obs': array([30, 4]),
'legal_actions': OrderedDict([(0, None), (1, None)]),
'raw_obs': {'player0 hand': ['S9', 'D2', 'HA', 'C7', 'CA'],
'dealer hand': ['C4'], 'actions': ('hit', 'stand'),
'state': (['S9', 'D2', 'HA', 'C7', 'CA'], ['C4'])},
'raw_legal_actions': ['hit', 'stand'],
'action_record': [(0, 'hit'), (0, 'hit'), (0, 'hit')]}
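I think that in the above example the player's sum should be 20 instead of 30: counting both aces as 11 gives 40, a single deduction of 10 leaves 30, and only counting both aces as 1 brings the hand to 20. The problem appeared while I was training my Monte Carlo Control agent (I'm a beginner in RL), so perhaps it's a problem in my code or I have misunderstood the game rules.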
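To make the expected behaviour concrete, here is a minimal sketch of how I would count the score. It is not the code from the repository: score_hand is a hypothetical helper, and it assumes cards are strings whose first character is the suit and whose remainder is the rank ('A', '2' to '9', 'T' or '10', 'J', 'Q', 'K'). It downgrades aces from 11 to 1 one at a time until the hand is no longer bust, and also reports whether an ace is still counted as 11.

```python
def score_hand(hand):
    """Return (score, usable_ace) for a list of card strings such as 'S9' or 'HA'.

    Hypothetical helper for illustration, not part of rlcard.
    """
    total = 0
    soft_aces = 0  # aces currently counted as 11
    for card in hand:
        rank = card[1:]  # assumption: the first character is the suit
        if rank == 'A':
            total += 11
            soft_aces += 1
        elif rank in ('T', '10', 'J', 'Q', 'K'):
            total += 10
        else:
            total += int(rank)
    # Downgrade aces from 11 to 1, one at a time, as often as needed.
    while total > 21 and soft_aces > 0:
        total -= 10
        soft_aces -= 1
    return total, soft_aces > 0


# The hand from the state above: both aces end up counted as 1.
print(score_hand(['S9', 'D2', 'HA', 'C7', 'CA']))  # (20, False)
```

With a loop instead of a single check, any number of aces in the hand is handled the same way.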
BTW, I hope you could add the usable ace feature to the blackjack environment, as well as an option for setting the number of decks, as in Sutton and Barto's book Reinforcement Learning: An Introduction. Or maybe I will do it myself and open a pull request later.