Bjarke Ebert
Results: 2 issues of Bjarke Ebert
Two ideas for improving training game data quality: (1) When exploring (using Dirichlet noise or temperature > 0), let only the "losing side" (according to the post-search root evaluation) do the exploration....
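Idea (1) could be sketched roughly as below. This is only an illustration under assumptions not stated in the issue: `select_move` is a hypothetical helper, `root_value` is taken to be the post-search root evaluation from the side to move's perspective in the range [-1, 1], and "losing" is taken to mean `root_value < 0`. Temperature sampling over visit counts stands in for whatever exploration scheme the engine actually uses.

```python
import random


def select_move(visit_counts, root_value, temperature=1.0, rng=random):
    """Pick a move index from search visit counts.

    Assumptions (not from the issue): root_value is the post-search root
    evaluation from the side to move's perspective, in [-1, 1], and the
    side to move counts as "losing" when root_value < 0.
    """
    # Greedy choice: the most-visited move (first one on ties).
    best = max(range(len(visit_counts)), key=lambda i: visit_counts[i])

    # The side that is not losing plays its best move without exploration.
    if root_value >= 0 or temperature <= 0:
        return best

    # The losing side explores: sample proportionally to N^(1/T).
    # Assumes at least one visit count is positive.
    weights = [count ** (1.0 / temperature) for count in visit_counts]
    return rng.choices(range(len(visit_counts)), weights=weights, k=1)[0]
```

With this shape, a side that the search considers ahead always plays the most-visited move, so any deviation from best play in the game record comes from the side that was already behind.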
# Suggestion
When extracting training data for policy training, skip lost positions. Lost positions are those that led to a loss in the training game, the actual outcome of the...
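A minimal sketch of the filtering step, assuming a "lost position" is one where the side to move went on to lose the training game. The `Sample` record, the `policy_training_samples` helper, and the choice to keep positions from drawn games are all hypothetical and not taken from the issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    side_to_move: int            # +1 or -1 (hypothetical encoding)
    policy_target: List[float]   # search-derived move probabilities


def policy_training_samples(samples: List[Sample], winner: int) -> List[Sample]:
    """Return the samples to use for policy training.

    winner is +1 or -1 for a decisive game and 0 for a draw.
    Positions where the side to move lost the game are skipped;
    keeping positions from drawn games is an assumption of this sketch.
    """
    return [s for s in samples if winner == 0 or s.side_to_move == winner]
```

For example, in a three-ply game won by the first player, only the two positions where the first player was to move would be kept for policy training.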