Yuandong Tian

27 comments by Yuandong Tian

Very good analysis! The intuition is that forcing an even distribution of black/white wins implicitly uses a posterior value for the value estimate. Note that the value function in OpenGo...

On the other hand, I am not sure whether this can affect black rollout efficiency. It might be the case that since black's Q is biased, more rollouts do not...

Ah, I see. Yes, the final trained value function is actually not a proper posterior. So the transformation is nonlinear and thus depends on Q (or E). Applying the transformation back...

The analysis of black rollout efficiency depends on where the optimal p_uct is. It is hard to draw a conclusion here. We might need to correct the value...

Thanks for your comments. The bottleneck is indeed in evaluation. In the sync version, we have to wait until all 400 games have been evaluated, and then decide whether we...

When there is a new model, all the previous selfplay games with the previous models need to be discarded.
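The discard rule above can be illustrated with a minimal sketch. It assumes each selfplay game is tagged with the version of the model that produced it; the names `SelfplayBuffer`, `add_game`, and `on_new_model` are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class SelfplayBuffer:
    """Hypothetical buffer that only keeps games from the current model."""

    model_version: int = 0
    games: list = field(default_factory=list)

    def add_game(self, game, version):
        # Accept a game only if the current model produced it;
        # games from older models are rejected as stale.
        if version != self.model_version:
            return False
        self.games.append(game)
        return True

    def on_new_model(self, version):
        # A new model invalidates every game collected so far.
        discarded = len(self.games)
        self.games.clear()
        self.model_version = version
        return discarded
```

Calling `on_new_model` returns the number of games that were thrown away, which gives a rough measure of how much selfplay work is lost on each model update.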

OK, I will take a look at this issue. Did you change BOARDSIZE to 9? `q_min`: minimum size of the replay buffer before training starts. `q_max`: maximum size of...
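As a rough illustration of how two such thresholds might interact, here is a minimal sketch. It assumes that `q_max` caps the same replay buffer and that the oldest entries are evicted first; neither point is confirmed by the comment above. The class `ReplayBuffer` and its methods are hypothetical and are not the project's implementation.

```python
from collections import deque


class ReplayBuffer:
    """Hypothetical replay buffer gated by q_min and capped by q_max."""

    def __init__(self, q_min, q_max):
        if q_min > q_max:
            raise ValueError("q_min must not exceed q_max")
        self.q_min = q_min
        # Assumption: once q_max entries are held, the oldest is dropped.
        self.games = deque(maxlen=q_max)

    def add(self, game):
        self.games.append(game)

    def ready(self):
        # Training may start only once at least q_min entries are stored.
        return len(self.games) >= self.q_min
```

With `ReplayBuffer(q_min=2, q_max=3)`, `ready()` is false after one game, true after two, and a fourth game pushes the first one out.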

@zchen0211, who is in charge of the Windows executable.

This is probably because of the version of PyTorch. A fix is on the way.