
[FEATURE] Implement self-play for two-player zero-sum games

Open RPegoud opened this issue 7 months ago • 0 comments

Description: Add self-play versions of DQN and PPO for two-player zero-sum games in PGX environments.

Checklist:

  • [ ] Determine how to keep value estimates consistent across the two players (e.g. flip the board so the agent always sees its own perspective, or negate the discount when bootstrapping from the opponent's values)
  • [ ] Add PGX environment configs
  • [ ] Implement self-play for DQN
  • [ ] Implement self-play for PPO
  • [ ] (optional) Implement self-play for AlphaZero, if possible
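
To make the first checklist item concrete, here is a minimal sketch of the "negated discount" option in plain Python. It assumes strictly alternating turns and that each reward is expressed from the perspective of the player who just moved; under those assumptions the next state's value belongs to the opponent, so it enters the target with a flipped sign. The names `negamax_returns` and `td_target` are hypothetical and are not part of Stoix or PGX.

```python
from typing import List


def negamax_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Per-step returns, each from the perspective of the player acting at that step.

    Assumption: rewards[t] is the reward of the player who moved at step t,
    and the two players strictly alternate turns.
    """
    returns = [0.0] * len(rewards)
    next_return = 0.0  # return of the *opponent* at step t + 1
    for t in reversed(range(len(rewards))):
        # Zero-sum: what is good for the next mover is bad for the current one.
        next_return = rewards[t] - gamma * next_return
        returns[t] = next_return
    return returns


def td_target(reward: float, next_value: float, done: bool, gamma: float = 1.0) -> float:
    """One-step target using a negated discount for the opponent's value."""
    if done:
        return reward
    return reward - gamma * next_value


# Three-move game in which the player who moves first (steps 0 and 2) wins.
print(negamax_returns([0.0, 0.0, 1.0]))
print(td_target(reward=0.0, next_value=0.8, done=False))
```

The board-flipping option changes only the observation, so the network always sees the position from the side to move; the sign flip in the target is still needed whenever the bootstrapped value comes from the opponent's turn.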

RPegoud, Jul 08 '24 13:07