MyAlphaGoZeroOnConnect4 Question about the network used

Hello

Can you tell me about the model you're using? How many conv layers, residual layers, etc.?

Mar 03 '19 16:03 lijas

If I remember correctly, the network has 1 input layer followed by 5 mid layers. Each of these mid layers uses both convolution and a residual connection. The network then splits into 2 branches: a value head and a policy head.
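For readers who want something concrete, here is a rough sketch of a network with that shape: an input convolution, 5 residual convolutional blocks, then a policy head and a value head. It is not the repository's actual code. PyTorch, the 2 input planes, the 64 filters, the 3x3 kernels, the batch norm and the head layouts are all assumptions made for illustration; only the overall layout (1 input layer, 5 residual mid layers, 2 heads) comes from the description above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

ROWS, COLS = 6, 7  # standard Connect 4 board


class ResidualBlock(nn.Module):
    """One 'mid layer': two 3x3 convolutions wrapped in a skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # residual (skip) connection


class PolicyValueNet(nn.Module):
    """Hypothetical layout; in_planes and channels are assumed values."""

    def __init__(self, in_planes=2, channels=64, num_blocks=5):
        super().__init__()
        # Input layer: lifts the board planes to `channels` feature maps.
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
        )
        # 5 residual mid layers.
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        # Policy head: one logit per column.
        self.policy_conv = nn.Conv2d(channels, 2, kernel_size=1)
        self.policy_fc = nn.Linear(2 * ROWS * COLS, COLS)
        # Value head: a single scalar in [-1, 1].
        self.value_conv = nn.Conv2d(channels, 1, kernel_size=1)
        self.value_fc = nn.Linear(ROWS * COLS, 1)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        p = F.relu(self.policy_conv(x)).flatten(1)
        policy_logits = self.policy_fc(p)
        v = F.relu(self.value_conv(x)).flatten(1)
        value = torch.tanh(self.value_fc(v))
        return policy_logits, value
```

Calling `PolicyValueNet()(torch.zeros(1, 2, 6, 7))` returns 7 policy logits (one per column) and 1 value for the single board in the batch.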

Mar 03 '19 23:03 yichen914

OK, thank you. And a question about the temperature parameter: for how many moves do you use tau=1 before changing to tau=0? Is the temperature only applied during self-play, or also when playing real games?

Mar 04 '19 06:03 lijas

Re: how many moves before changing tau from 1 to 0, I think it is something you tune from experience. In my code, I set it to 10 steps. The temperature balances exploration and exploitation. When we train the network (self-play), we need it to explore enough before it commits to a certain path. But when we compare or test the network, we assume that the path it takes is the "best" (so far), so we don't need to set the temperature to 1.
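A minimal sketch of that schedule, assuming the move is chosen from the MCTS visit counts at the root (as in AlphaGo Zero). The function name `select_move`, the `visit_counts` dictionary and the `tau_moves` parameter are hypothetical names for illustration, not taken from the repository; only the threshold of 10 steps comes from the answer above.

```python
import random


def select_move(visit_counts, move_number, tau_moves=10, rng=random):
    """Pick a move from root visit counts.

    visit_counts: dict mapping move (column index) -> MCTS visit count.
    move_number:  0-based index of the move being played in this game.
    tau_moves:    number of opening moves played with tau = 1.
    """
    moves = list(visit_counts)
    if move_number < tau_moves:
        # tau = 1: sample in proportion to the visit counts (exploration).
        # Assumes at least one move has a nonzero visit count.
        weights = [visit_counts[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]
    # tau -> 0: always play the most visited move (exploitation).
    return max(moves, key=lambda m: visit_counts[m])
```

For self-play you would call it with the default `tau_moves=10`; for evaluation or real games, passing `tau_moves=0` makes every move greedy.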

Mar 04 '19 22:03 yichen914