open_spiel
How to compute the exploitability of a large game
In open_spiel/python/examples/deep_cfr.py, a tabular method is used to compute the exploitability or NashConv of Kuhn poker:

    average_policy = policy.tabular_policy_from_callable(
        game, deep_cfr_solver.action_probabilities)
    conv = exploitability.nash_conv(game, average_policy)

Is there a good way to calculate exploitability when the game is too large for a tabular policy? Thank you for your reply!
The best method I know of is Approximate Best Response (ABR) by Timbers et al., but there is no implementation in OpenSpiel and it requires a lot of computational resources. This paper will be presented by Nolan Bard at IJCAI later this month. There is a tabular implementation in https://github.com/deepmind/open_spiel/pull/852 by @rezunli96 that could be adapted to function approximation.
The next closest thing is to run a standard RL algorithm against the fixed opponents. There is an example that does this using DQN here: https://github.com/deepmind/open_spiel/blob/master/open_spiel/python/examples/rl_response.py. The problem, of course, is that DQN might not find any exploits, and in that case you can't say much about exploitability, because the value the learned response achieves is only a lower bound on the true best-response value. But it's still something!
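To make the lower-bound point concrete, here is a minimal sketch on rock-paper-scissors rather than an OpenSpiel game. It is not the code from rl_response.py: the epsilon-greedy bandit learner is a stand-in for DQN, and `PAYOFF`, `fixed_policy`, and the function names are all made up for illustration. The learned response is evaluated exactly against the fixed opponent, so its value can never exceed the true best-response value.

```python
import random

# Row player's payoffs in rock-paper-scissors: PAYOFF[my_action][opp_action].
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def expected_value(action, opponent_policy):
    """Exact expected payoff of a pure action against a fixed mixed policy."""
    return sum(p * PAYOFF[action][b] for b, p in enumerate(opponent_policy))


def exact_best_response_value(opponent_policy):
    """Exact best-response value; feasible only because the game is tiny."""
    return max(expected_value(a, opponent_policy) for a in range(3))


def learned_response_value(opponent_policy, episodes, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit learner, a hypothetical stand-in for DQN."""
    rng = random.Random(seed)
    q = [0.0] * 3       # running mean payoff per action
    counts = [0] * 3
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.randrange(3)
        else:
            a = max(range(3), key=lambda i: q[i])
        # Sample the fixed opponent's action and update the running mean.
        b = rng.choices(range(3), weights=opponent_policy)[0]
        counts[a] += 1
        q[a] += (PAYOFF[a][b] - q[a]) / counts[a]
    greedy = max(range(3), key=lambda i: q[i])
    # Exact evaluation of the learned response: a lower bound on the BR value.
    return expected_value(greedy, opponent_policy)


fixed_policy = [0.5, 0.3, 0.2]  # opponent plays rock too often
exact = exact_best_response_value(fixed_policy)
for episodes in (5, 50, 5000):
    approx = learned_response_value(fixed_policy, episodes)
    print(episodes, round(approx, 3), "<=", round(exact, 3))
```

If the learner fails to find the best action, the printed value sits below the exact one, and nothing in the learner's output tells you how large the gap is. That is the situation you are in for a large game, where the exact column is not available.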
In general, this is an active research area and the problem is largely unsolved, but we're making some progress!
See also https://github.com/deepmind/open_spiel/issues/772
Wow, I will take a look at these references right away. If I have any questions, I will post them in this thread.
RL methods for evaluating exploitability seem impractical in multi-player extensive-form games. Maybe the problem is even harder for large games with more than two players?
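One way to see where the extra cost comes from: NashConv sums, over all players, how much each player could gain by switching to a best response while everyone else stays fixed, so an RL-based estimate needs one trained responder per player. The sketch below computes that sum exactly by enumeration on a toy three-player "odd one out" game. The game, `utility`, and the other names are invented for illustration and are not part of OpenSpiel.

```python
import itertools

NUM_PLAYERS = 3
NUM_ACTIONS = 2


def utility(player, joint_action):
    """Toy game: a player scores 1 when nobody else picked their action."""
    mine = joint_action[player]
    others = [a for p, a in enumerate(joint_action) if p != player]
    return 1.0 if all(a != mine for a in others) else 0.0


def expected_utility(player, policies):
    """Exact expected utility by enumerating every joint action."""
    total = 0.0
    for joint in itertools.product(range(NUM_ACTIONS), repeat=NUM_PLAYERS):
        prob = 1.0
        for p, a in enumerate(joint):
            prob *= policies[p][a]
        total += prob * utility(player, joint)
    return total


def nash_conv(policies):
    """Sum over players of the gain from a unilateral best response."""
    total_gain = 0.0
    for player in range(NUM_PLAYERS):
        current = expected_utility(player, policies)
        best = current
        # One best-response computation per player, all others held fixed.
        for action in range(NUM_ACTIONS):
            pure = [1.0 if a == action else 0.0 for a in range(NUM_ACTIONS)]
            deviated = list(policies)
            deviated[player] = pure
            best = max(best, expected_utility(player, deviated))
        total_gain += best - current
    return total_gain


uniform = [[0.5, 0.5]] * NUM_PLAYERS
biased = [[0.8, 0.2]] * NUM_PLAYERS
print("uniform:", nash_conv(uniform))
print("biased: ", nash_conv(biased))
```

In a large game, each `max` over pure deviations would be replaced by a separately trained RL agent, and each of those gives only a lower bound on that player's gain, so the summed estimate is a lower bound on NashConv as well.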
Related: #519