[Feature Request] MCTS Issue tracker
- [ ] `break_when_all_done` in `env.rollout()` #2355
- [ ] Partial steps in env #2356
- [ ] `BatchedEnv`: pass the indices of envs where a step should be done
- [ ] BatchedEnv: index a BatchedEnv
- [ ] Stateless env: discard done parts of the TD
- [ ]
- [ ] Scores #2358
- [x] PUCT (missing tests)
- [x] UCB (missing tests)
- [ ] UCB1_TUNED
- [ ] EXP3
- [ ] PUCT_VARIANT
- [x] `TensorSpec.enumerate()`
- [x] Storage
- [x] Hashing functions #2304
- [x] Query modules #2305
- [x] Map #2306
- [x] MCTSForest #2307
- [ ] Policy classes #2359
- [ ] MCTSPolicyBase
- [ ] MCTSPolicy
- [ ] AlphaGoPolicy
- [ ] AlphaStarPolicy
- [ ] MuZeroPolicy
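
For reference, here is a minimal sketch of the two scores already checked off in the list (UCB and PUCT), written in their common textbook forms. This is an assumption about what those items compute, not the library's implementation: the function names `ucb1_score` and `puct_score`, their parameters, and the default exploration constants are all hypothetical.

```python
import math


def ucb1_score(mean_value, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1 score: mean value plus an exploration bonus.

    Hypothetical helper, not the library's API. Unvisited children
    get an infinite score so they are tried at least once.
    """
    if visits == 0:
        return math.inf
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


def puct_score(mean_value, prior, visits, parent_visits, c=1.0):
    """AlphaZero-style PUCT score: mean value plus a prior-weighted bonus.

    Hypothetical helper, not the library's API. The bonus shrinks as
    the child accumulates visits.
    """
    return mean_value + c * prior * math.sqrt(parent_visits) / (1 + visits)


# Select the child with the highest PUCT score among (mean, prior, visits) tuples.
children = [(0.1, 0.6, 3), (0.4, 0.3, 5), (0.0, 0.1, 0)]
parent_visits = sum(v for _, _, v in children)
best = max(range(len(children)),
           key=lambda i: puct_score(children[i][0], children[i][1],
                                    children[i][2], parent_visits))
```

Both helpers take scalar statistics for a single child. A batched version would apply the same arithmetic elementwise over tensors of visit counts and values.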
cc @dtsaras @mjlaali