[FEATURE] Implement self-play for two-player zero-sum games

Open • RPegoud opened this issue 7 months ago • 2 comments

Issue: #99

Description: Add self-play versions of DQN and PPO for two-player zero-sum games in PGX environments.

Checklist:

  • [x] Determine how to keep the value estimation consistent (e.g. flip the board, use a negative discount)
  • [x] Add PGX environment configs
  • [ ] Implement self-play for DQN
  • [ ] Implement self-play for PPO
  • [ ] (optional) Implement self-play for AlphaZero, if possible
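
The first checklist item names two ways to keep value estimates consistent when one network plays both sides: flipping the board and using a negative discount. The sketch below illustrates both ideas in plain Python. It is not the Stoix or PGX implementation; the function names `negamax_returns` and `canonical_board`, the `+1`/`-1` player encoding, and the convention that each reward is given from the perspective of the player who just acted are all assumptions made for illustration.

```python
from typing import List


def negamax_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Discounted returns for an alternating-turn, zero-sum episode.

    Assumption: rewards[t] is the reward seen by the player who acted at
    step t. The next step belongs to the opponent, so the opponent's return
    is subtracted (equivalent to bootstrapping with a discount of -gamma).
    """
    returns = [0.0] * len(rewards)
    next_return = 0.0  # return after the terminal step
    for t in reversed(range(len(rewards))):
        # What is good for the opponent at t + 1 is bad for the player at t.
        next_return = rewards[t] - gamma * next_return
        returns[t] = next_return
    return returns


def canonical_board(board: List[List[int]], player: int) -> List[List[int]]:
    """Flip the board so the player to move always sees its own pieces as +1.

    Assumption: cells are +1 / -1 for the two players and 0 for empty,
    and `player` is +1 or -1.
    """
    return [[cell * player for cell in row] for row in board]


if __name__ == "__main__":
    # Three-move episode: the first player moves at steps 0 and 2 and wins.
    print(negamax_returns([0.0, 0.0, 1.0]))  # [1.0, -1.0, 1.0]
    # The second player (-1) sees the same position with the signs swapped.
    print(canonical_board([[1, -1], [0, 0]], player=-1))  # [[-1, 1], [0, 0]]
```

With either convention a single value function can be trained on transitions from both players, since every target is expressed from the point of view of the player to move.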

RPegoud, Jul 15 '24 14:07