PufferLib
TicTacToe Environment
I am getting 5.7M SPS in the C implementation vs. 2.7k in the Python implementation. I didn't tinker much with the training parameters. Training seems pretty good after 100 million steps.
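For context, SPS is environment steps per second. Below is a minimal sketch of how a number like this could be measured. The `TicTacToe` class and `measure_sps` helper are hypothetical stand-ins written for illustration, not the code in this PR and not PufferLib's API. It plays random legal moves and times only the environment loop, so it leaves out policy inference and vectorization.

```python
import random
import time


class TicTacToe:
    """Hypothetical pure-Python environment (illustration only)."""

    WIN_LINES = (
        (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
        (0, 4, 8), (2, 4, 6),             # diagonals
    )

    def __init__(self):
        self.reset()

    def reset(self):
        # 0 = empty, 1 = first player, -1 = second player
        self.board = [0] * 9
        self.player = 1
        return tuple(self.board)

    def legal_actions(self):
        return [i for i, cell in enumerate(self.board) if cell == 0]

    def step(self, action):
        # Assumes `action` is a legal (empty) cell.
        self.board[action] = self.player
        won = any(
            all(self.board[i] == self.player for i in line)
            for line in self.WIN_LINES
        )
        done = won or 0 not in self.board
        reward = 1.0 if won else 0.0
        self.player = -self.player
        return tuple(self.board), reward, done


def measure_sps(num_steps, seed=0):
    """Return environment steps per second under a random policy."""
    rng = random.Random(seed)
    env = TicTacToe()
    env.reset()
    start = time.perf_counter()
    for _ in range(num_steps):
        _, _, done = env.step(rng.choice(env.legal_actions()))
        if done:
            env.reset()
    elapsed = time.perf_counter() - start
    return num_steps / elapsed


if __name__ == "__main__":
    print(f"{measure_sps(100_000):,.0f} SPS")
```

Absolute numbers from a loop like this depend heavily on the machine and on what is included in the timed region, so they are only useful for comparing implementations under the same conditions.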