open_spiel
Add option to store all world states in information state tree
This PR adds a boolean flag to the constructor of the infostate tree that allows storing the corresponding world states and chance reach probabilities at all infostate nodes (except filler nodes), not just at the leaf nodes.
In particular, this change makes it possible to obtain an inverse mapping from info state to world states when traversing a player's infostate tree.
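As a rough illustration of that inverse mapping, the sketch below walks a toy tree and collects, for every node, the world states and chance reach probabilities stored on it. `ToyInfoNode`, `InverseMap`, `CollectWorldStates`, `MakeToyTree`, and the infostate labels are all hypothetical names made up for this example; they are not the open_spiel API, and the real `InfostateNode` exposes its data differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an infostate node (NOT the open_spiel class).
struct ToyInfoNode {
  std::string infostate;                   // label identifying the infostate
  std::vector<std::string> world_states;   // histories consistent with it
  std::vector<double> chance_reach_probs;  // one entry per world state
  std::vector<std::unique_ptr<ToyInfoNode>> children;
};

// Maps an infostate label to its (world state, chance reach prob) pairs.
using InverseMap =
    std::map<std::string, std::vector<std::pair<std::string, double>>>;

// Depth-first traversal that records the data stored on every node.
void CollectWorldStates(const ToyInfoNode& node, InverseMap& out) {
  // Assumption of this sketch: both vectors are kept in parallel.
  assert(node.world_states.size() == node.chance_reach_probs.size());
  for (std::size_t i = 0; i < node.world_states.size(); ++i) {
    out[node.infostate].emplace_back(node.world_states[i],
                                     node.chance_reach_probs[i]);
  }
  for (const auto& child : node.children) {
    CollectWorldStates(*child, out);
  }
}

// Builds a two-node fragment mirroring the Kuhn poker nodes in which
// player 0 holds card 2 (world states "2", then "2 0" and "2 1").
std::unique_ptr<ToyInfoNode> MakeToyTree() {
  auto decision = std::make_unique<ToyInfoNode>();
  decision->infostate = "card 2, to act";  // hypothetical label
  decision->world_states = {"2 0", "2 1"};
  decision->chance_reach_probs = {1.0 / 6.0, 1.0 / 6.0};

  auto root = std::make_unique<ToyInfoNode>();
  root->infostate = "card 2 dealt";  // hypothetical label
  root->world_states = {"2"};
  root->chance_reach_probs = {1.0 / 3.0};
  root->children.push_back(std::move(decision));
  return root;
}
```

Calling `CollectWorldStates(*MakeToyTree(), map)` on an empty `InverseMap` fills it with one entry per node, so a lookup by infostate label returns every world state that is consistent with it.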
The changes are:
- added boolean flag to `InfostateTree` class
- change logic in building tree to account for the flag
- adapt info state tree tests to account for the change
- added `is_filler_node()` check to `InfostateNode` class to query node
- extended `child_at` method of `InfostateNode` to choose to skip filler nodes via `bool` flag
- exchange `const std::string&` with `std::string_view` in a couple of appropriate places in `infostate_tree.(h|cpp)`
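To make the filler-node items above more concrete, here is a minimal sketch of how a `child_at` with a skip flag could behave. `FillerDemoNode`, its `AddChild` helper, and `MakeFillerChain` are hypothetical, and the assumption that a filler node has exactly one child is made for this example only; the signature and semantics of the real `InfostateNode::child_at` in this PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy node (NOT the open_spiel InfostateNode).
class FillerDemoNode {
 public:
  FillerDemoNode(std::string label, bool is_filler)
      : label_(std::move(label)), is_filler_(is_filler) {}

  bool is_filler_node() const { return is_filler_; }
  const std::string& label() const { return label_; }
  std::size_t num_children() const { return children_.size(); }

  // Appends a child and returns a non-owning pointer to it.
  FillerDemoNode* AddChild(std::string label, bool is_filler) {
    children_.push_back(
        std::make_unique<FillerDemoNode>(std::move(label), is_filler));
    return children_.back().get();
  }

  // Returns the child at `index`. When `skip_filler` is true, descends
  // through single-child filler nodes until a non-filler node is reached.
  const FillerDemoNode* child_at(std::size_t index,
                                 bool skip_filler = false) const {
    const FillerDemoNode* child = children_.at(index).get();
    while (skip_filler && child->is_filler_node() &&
           child->num_children() == 1) {
      child = child->children_.front().get();
    }
    return child;
  }

 private:
  std::string label_;
  bool is_filler_;
  std::vector<std::unique_ptr<FillerDemoNode>> children_;
};

// Builds root -> filler -> filler -> terminal.
std::unique_ptr<FillerDemoNode> MakeFillerChain() {
  auto root = std::make_unique<FillerDemoNode>("root", false);
  FillerDemoNode* first = root->AddChild("filler 1", true);
  FillerDemoNode* second = first->AddChild("filler 2", true);
  second->AddChild("terminal", false);
  return root;
}
```

With this toy tree, `root->child_at(0)` returns the first filler node, while `root->child_at(0, true)` jumps straight to the node labelled `terminal`.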
An example output of traversing player 0's infostate tree of 2-player Kuhn Poker (skipping filler nodes):
Child-Sequence: [] type: observation #children: 1 states: [] chance_reach_prob: []
Child-Sequence: [0] type: observation #children: 3 states: [] chance_reach_prob: ['1.000']
Child-Sequence: [0, 2] type: observation #children: 1 states: [2] chance_reach_prob: ['0.333']
Child-Sequence: [0, 2, 0] type: decision #children: 2 states: [2 0, 2 1] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 1] type: observation #children: 1 states: [2 0 b, 2 1 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 1, 0] type: observation #children: 4 states: [2 0 b, 2 1 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 1, 0, 3] type: terminal #children: 0 states: [2 1 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 1, 0, 2] type: terminal #children: 0 states: [2 1 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 1, 0, 1] type: terminal #children: 0 states: [2 0 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 1, 0, 0] type: terminal #children: 0 states: [2 0 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0] type: observation #children: 1 states: [2 0 p, 2 1 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 0, 0] type: observation #children: 3 states: [2 0 p, 2 1 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 0, 0, 2] type: terminal #children: 0 states: [2 1 pp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1] type: decision #children: 2 states: [2 0 pb, 2 1 pb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 1] type: observation #children: 2 states: [2 0 pbb, 2 1 pbb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 1, 1] type: terminal #children: 0 states: [2 1 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 1, 0] type: terminal #children: 0 states: [2 0 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 0] type: observation #children: 2 states: [2 0 pbp, 2 1 pbp] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 0, 1] type: terminal #children: 0 states: [2 1 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0, 0, 1, 0, 0] type: terminal #children: 0 states: [2 0 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 2, 0, 0, 0, 0] type: terminal #children: 0 states: [2 0 pp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1] type: observation #children: 1 states: [1] chance_reach_prob: ['0.333']
Child-Sequence: [0, 1, 0] type: decision #children: 2 states: [1 0, 1 2] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 1] type: observation #children: 1 states: [1 0 b, 1 2 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 1, 0] type: observation #children: 4 states: [1 0 b, 1 2 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 1, 0, 3] type: terminal #children: 0 states: [1 2 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 1, 0, 2] type: terminal #children: 0 states: [1 2 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 1, 0, 1] type: terminal #children: 0 states: [1 0 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 1, 0, 0] type: terminal #children: 0 states: [1 0 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0] type: observation #children: 1 states: [1 0 p, 1 2 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 0, 0] type: observation #children: 3 states: [1 0 p, 1 2 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 0, 0, 2] type: terminal #children: 0 states: [1 2 pp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1] type: decision #children: 2 states: [1 0 pb, 1 2 pb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 1] type: observation #children: 2 states: [1 0 pbb, 1 2 pbb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 1, 1] type: terminal #children: 0 states: [1 2 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 1, 0] type: terminal #children: 0 states: [1 0 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 0] type: observation #children: 2 states: [1 0 pbp, 1 2 pbp] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 0, 1] type: terminal #children: 0 states: [1 2 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0, 0, 1, 0, 0] type: terminal #children: 0 states: [1 0 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 1, 0, 0, 0, 0] type: terminal #children: 0 states: [1 0 pp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0] type: observation #children: 1 states: [0] chance_reach_prob: ['0.333']
Child-Sequence: [0, 0, 0] type: decision #children: 2 states: [0 1, 0 2] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 1] type: observation #children: 1 states: [0 1 b, 0 2 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 1, 0] type: observation #children: 4 states: [0 1 b, 0 2 b] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 1, 0, 3] type: terminal #children: 0 states: [0 2 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 1, 0, 2] type: terminal #children: 0 states: [0 2 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 1, 0, 1] type: terminal #children: 0 states: [0 1 bb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 1, 0, 0] type: terminal #children: 0 states: [0 1 bp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0] type: observation #children: 1 states: [0 1 p, 0 2 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 0, 0] type: observation #children: 3 states: [0 1 p, 0 2 p] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 0, 0, 2] type: terminal #children: 0 states: [0 2 pp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1] type: decision #children: 2 states: [0 1 pb, 0 2 pb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 1] type: observation #children: 2 states: [0 1 pbb, 0 2 pbb] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 1, 1] type: terminal #children: 0 states: [0 2 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 1, 0] type: terminal #children: 0 states: [0 1 pbb] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 0] type: observation #children: 2 states: [0 1 pbp, 0 2 pbp] chance_reach_prob: ['0.167', '0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 0, 1] type: terminal #children: 0 states: [0 2 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0, 0, 1, 0, 0] type: terminal #children: 0 states: [0 1 pbp] chance_reach_prob: ['0.167']
Child-Sequence: [0, 0, 0, 0, 0, 0] type: terminal #children: 0 states: [0 1 pp] chance_reach_prob: ['0.167']
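
The chance reach probabilities in the output can be checked by hand, assuming the usual Kuhn poker deal of one card each to two players from a three-card deck without replacement. Player actions do not change the chance reach probability, which is why every world state after the deal keeps the value 0.167.

```latex
P(\text{first card} = c) = \frac{1}{3} \approx 0.333,
\qquad
P(\text{first card} = c,\ \text{second card} = c')
  = \frac{1}{3} \cdot \frac{1}{2} = \frac{1}{6} \approx 0.167
\quad (c' \neq c)
```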