Brian Lee

34 comments by Brian Lee

Try using 'randompolicy' as an option instead of 'policy'. 'policy' always takes the most likely move, while 'randompolicy' selects a move at random, weighted by the NN's output probabilities.
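To make the difference concrete, here is a minimal sketch of the two selection rules. The function `pick_move` and its arguments are hypothetical names for illustration, not the project's actual code; it assumes the network output is a plain list of non-negative move probabilities.

```python
import random


def pick_move(probs, mode, rng=None):
    """Pick a move index from a list of move probabilities (hypothetical helper)."""
    if mode == "policy":
        # Greedy: always return the single most likely move.
        return max(range(len(probs)), key=lambda i: probs[i])
    if mode == "randompolicy":
        # Stochastic: sample one move, weighted by the given probabilities.
        if rng is None:
            rng = random.Random()
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raise ValueError(f"unknown mode: {mode}")
```

With `probs = [0.1, 0.7, 0.2]`, the 'policy' mode returns index 1 every time, whereas 'randompolicy' returns index 1 about 70% of the time and the other moves otherwise, so repeated games no longer play out identically.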

What sort of commentary are you looking for? I don't necessarily want to be in the business of writing Python tutorials or TensorFlow tutorials, but stuff like MCTS or stuff...

A lot of it is already pretty well commented at the top of each file, as to how the whole thing is arranged. Is there something in particular that you're...

Can you give me an example of the sgf file that it's running into issues on? I suspect it's an sgf file that violates the standards, so having the file...

Oh.. ugh, this makes me sad. So, the SGF file should declare that its encoding is GB18030; I can't just assume it. Most western-generated SGFs assume UTF-8, so putting in...
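As a rough sketch of what honoring a declared encoding could look like: SGF has a `CA` (charset) root property, so a reader can look for it in the raw bytes and decode with whatever it names. The helper name `decode_sgf` and the UTF-8 fallback are assumptions made for this example, not how the parser in question actually behaves.

```python
import re


def decode_sgf(raw: bytes, default: str = "utf-8") -> str:
    """Decode SGF bytes using the charset named in CA[...], if present.

    Hypothetical helper: falls back to `default` when no CA property is found.
    """
    # Property identifiers are ASCII, so CA[...] can be located before decoding.
    match = re.search(rb"(?<![A-Z])CA\[([^\]]+)\]", raw)
    encoding = match.group(1).decode("ascii").strip() if match else default
    # Raises LookupError if the declared charset is not one Python knows.
    return raw.decode(encoding)
```

A file that declares `CA[GB18030]` is then decoded correctly, while a file with no declaration still goes through the default and can fail or produce mojibake if that guess is wrong.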

One concrete idea: instead of selecting 2% flat from the last 50 generations, select 4%->0% over the last 50 generations, with some sort of exponentially decaying curve, and also make...
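One possible reading of that schedule, as a sketch only: the sampling fraction starts at 4% for the newest generation and decays exponentially to exactly 0% at the edge of the 50-generation window. The function name, the `rate` parameter, and the choice that the newest generation gets the highest fraction are all assumptions for illustration.

```python
import math


def sampling_fraction(age, window=50, start=0.04, rate=3.0):
    """Fraction of a generation to sample, given its age in generations.

    age=0 is the newest generation (assumed). The curve decays exponentially
    from `start` at age 0 to exactly 0 at age `window`. `rate` is a
    hypothetical knob controlling how sharply the curve bends.
    """
    if age < 0 or age >= window:
        return 0.0
    floor = math.exp(-rate)
    # Shift and rescale exp(-rate * age / window) so it ends at exactly zero.
    return start * (math.exp(-rate * age / window) - floor) / (1.0 - floor)
```

Note that the total amount sampled under this curve is not automatically the same as under a flat 2%; `start` or `rate` would need tuning if the overall volume should stay constant.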

This general class of idea is called a 'baseline', and existing examples work by subtracting a baseline (I expected to win with 0.95 probability) from the eventual result (I won...

The word 'baseline' is what you'll want to search to actually find previous literature on the topic.
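In its simplest form the baseline idea is a subtraction. The sketch below assumes results are coded as 1.0 for a win and 0.0 for a loss and that the baseline is the predicted win probability; both the coding and the name `advantage` are illustrative assumptions.

```python
def advantage(result, baseline):
    """Training signal after subtracting the expected outcome from the actual one."""
    return result - baseline


# Expected to win with probability 0.95:
signal_after_win = advantage(1.0, 0.95)   # small positive signal, about 0.05
signal_after_loss = advantage(0.0, 0.95)  # large negative signal, about -0.95
```

An expected win carries little learning signal, while a surprising loss carries a lot, which is the effect the baseline is meant to produce.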

I see what you're saying. I think the baseline approach effectively does what you want. The gradients per example are linearly correlated to the final error, so if you train...

fwiw, instead of adding additional examples to the stream, you can just drop them with probability (1-p) or something. Drawback is that you can't reuse that stream anymore since it hard-codes...
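A minimal sketch of that dropping step, assuming the stream is any Python iterable of examples; the generator name `subsample` is made up for this example.

```python
import random


def subsample(stream, p, rng=None):
    """Yield each example with probability p, i.e. drop it with probability 1 - p."""
    if rng is None:
        rng = random.Random()
    for example in stream:
        # rng.random() is uniform on [0, 1), so this keeps a fraction p on average.
        if rng.random() < p:
            yield example
```

For instance, `list(subsample(examples, 0.25))` keeps roughly a quarter of the examples, with a different random subset on each pass unless a seeded `rng` is supplied.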