Weirui Ye
An agent needs to be trained separately for each game. The dynamics function takes actions as input, and the policy function outputs actions. Since different environments have different actions and the...
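To illustrate why the action space ties a model to one game, here is a minimal sketch. It assumes actions are one-hot encoded and concatenated to a flat hidden state, which is a simplification for illustration and not necessarily how the repository encodes actions. The names `one_hot`, `model_shapes`, `hidden_dim`, and the two action-space sizes are hypothetical.

```python
def one_hot(action, action_space_size):
    """Encode a discrete action index as a one-hot list (assumed encoding)."""
    if not 0 <= action < action_space_size:
        raise ValueError("action index out of range")
    return [1.0 if i == action else 0.0 for i in range(action_space_size)]


def model_shapes(hidden_dim, action_space_size):
    """Return the layer widths that depend on the action space.

    Assumption: the dynamics function consumes [hidden_state, one_hot(action)]
    and the policy function emits one logit per action.
    """
    return {
        "dynamics_input_dim": hidden_dim + action_space_size,
        "policy_output_dim": action_space_size,
    }


# Two hypothetical environments with different numbers of discrete actions.
shapes_a = model_shapes(hidden_dim=64, action_space_size=4)
shapes_b = model_shapes(hidden_dim=64, action_space_size=18)

# The widths differ, so weights trained for one game do not fit the other.
shapes_match = shapes_a == shapes_b
```

Under these assumptions, both the first dynamics layer and the last policy layer change size with the action space, so the trained parameters cannot simply be reused across games.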
Hi, I noticed that you have changed "(image_channel,96,96)" to "(image_channel,7,7)". Is the observation shape of your environment 7x7? Note that there is a downsampling network in model.py, which tries to...
Yes, it is coming soon. Thank you for your attention.
Thank you for your correction. I think it is a bug. Except for the observation history, all the other statistics (e.g., visits, values, rewards) should be indexed from 0...
Thank you for your correction! You are right. It is a bug that results in wrong min/max values on the tree side. Thank you very much for your detailed reading. And...
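For context on why wrong min/max values matter, here is a minimal sketch of the kind of min/max tracker used to normalize values during tree search. It is written in the style of the publicly released MuZero pseudocode and is not necessarily the implementation in this repository; the class and method names are assumptions.

```python
class MinMaxStats:
    """Track the smallest and largest values seen in the search tree."""

    def __init__(self):
        self.maximum = -float("inf")
        self.minimum = float("inf")

    def update(self, value):
        # Widen the tracked range to include the new value.
        self.maximum = max(self.maximum, value)
        self.minimum = min(self.minimum, value)

    def normalize(self, value):
        # Rescale relative to the tracked range once the range is non-empty;
        # otherwise return the value unchanged.
        if self.maximum > self.minimum:
            return (value - self.minimum) / (self.maximum - self.minimum)
        return value


stats = MinMaxStats()
for v in (1.0, 3.0):
    stats.update(v)

normalized = stats.normalize(2.0)  # midpoint of the tracked range [1.0, 3.0]
```

If the tracker is updated with values read from the wrong index, the range is wrong and every normalized value used to compare nodes is skewed.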
Yes, it is coming soon. Thank you for your attention.
Thank you for your comments. The identity connection here follows the same architecture as ResNets. The residual part provides richer and better gradients when the network is deep. Considering the...
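The effect of the identity connection on gradients can be seen in a toy example. The sketch below uses hypothetical scalar "layers" of the form tanh(w * x), not the network from the repository, and compares the end-to-end derivative of a 20-layer stack with and without the identity connection.

```python
import math


def layer(x, w):
    """Toy scalar layer f(x) = tanh(w * x); for illustration only."""
    return math.tanh(w * x)


def layer_grad(x, w):
    """Derivative of the toy layer with respect to its input x."""
    return w * (1.0 - math.tanh(w * x) ** 2)


def forward_and_grad(x, weights, residual):
    """Run x through the stack; return (output, d output / d input)."""
    grad = 1.0
    for w in weights:
        local = layer_grad(x, w)
        if residual:
            # y = x + f(x), so dy/dx = 1 + f'(x)
            grad *= 1.0 + local
            x = x + layer(x, w)
        else:
            # y = f(x), so dy/dx = f'(x)
            grad *= local
            x = layer(x, w)
    return x, grad


weights = [0.5] * 20  # 20 toy layers, each with derivative at most 0.5
_, plain_grad = forward_and_grad(0.1, weights, residual=False)
_, res_grad = forward_and_grad(0.1, weights, residual=True)
```

In this toy setting the plain stack multiplies 20 factors that are each at most 0.5, so its gradient shrinks toward zero, while every factor in the residual stack is at least 1 because of the identity term.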