rlcard
In Mahjong game prediction, the order of state['current_hand'] appears to influence the result of eval_step. What could be the reason?
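Found the root cause: in the Mahjong extract_state function, raw_legal_actions and legal_actions don't match. legal_actions is the de-duplicated (unique) list of the tiles in the player's hand, but raw_legal_actions is the full list of the player's hand, duplicates included. When the hand holds repeated tiles, the two can differ in length and no longer line up by position.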
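To illustrate the mismatch, here is a minimal, self-contained sketch. It is not rlcard's actual code: the tile names, the ACTION_ID table, and the helper functions are all hypothetical, and it assumes that some consumer (such as an agent's eval_step) pairs the two lists by index. Under that assumption, the same hand in a different order yields a different pairing.

```python
from collections import OrderedDict

# Hypothetical tile-to-id table, for illustration only (not rlcard's real table).
ACTION_ID = {"bamboo-1": 0, "bamboo-2": 1, "dots-5": 2}


def extract_actions(hand):
    """Mimic the reported mismatch: raw list keeps duplicates, id dict does not."""
    raw_legal_actions = list(hand)
    legal_actions = OrderedDict((ACTION_ID[tile], None) for tile in hand)
    return raw_legal_actions, legal_actions


def pair_by_index(raw_legal_actions, legal_actions):
    """Pair raw actions with action ids by position (the assumed consumer logic)."""
    ids = list(legal_actions.keys())
    return {raw_legal_actions[i]: ids[i] for i in range(len(ids))}


def dedupe(hand):
    """Drop repeated tiles while keeping first-seen order."""
    return list(OrderedDict.fromkeys(hand))


hand_a = ["bamboo-1", "bamboo-1", "dots-5"]
hand_b = ["dots-5", "bamboo-1", "bamboo-1"]  # same tiles, different order

# With duplicates left in the raw list, the pairing depends on hand order.
broken_a = pair_by_index(*extract_actions(hand_a))
broken_b = pair_by_index(*extract_actions(hand_b))

# One possible fix: de-duplicate the raw list the same way as the id dict.
_, legal_a = extract_actions(hand_a)
fixed_a = pair_by_index(dedupe(hand_a), legal_a)

print(broken_a)  # {'bamboo-1': 2}
print(broken_b)  # {'dots-5': 2, 'bamboo-1': 0}
print(fixed_a)   # {'bamboo-1': 0, 'dots-5': 2}
```

In this sketch, hand_a mislabels the id of dots-5 as bamboo-1, while hand_b happens to line up correctly, which matches the order dependence described in the question. Building both lists from the same de-duplicated sequence keeps them aligned regardless of order.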