dzy1997

Results 2 issues of dzy1997

I am trying to adapt the tic-tac-toe example to train through self-play. I train an agent against a fixed agent (with the same network architecture) until the average win rate...

question

I am trying to run TDAN for the baseline of a competition, but I am having some problems setting up the environment. I tried to create an environment in conda: `conda create...