Bjarke Ebert comments

Results: 8 comments of Bjarke Ebert

Have a separate test and training window folder for training.

A related issue is that all games from a period are correlated / biased, since they are produced by the same ID and same search algo (although randomized). So even...

Have a separate test and training window folder for training.

Sure, the loss on older validation games is expected to be bigger than on more recent validation games. But not by *a lot*. So the monitoring could be used to...

Have a separate test and training window folder for training.

I find it strange that each game is used in multiple different training sessions. But given that we do, here's a modification of the idea of this issue: In each...

Don't use losing positions to train policy

I am well aware of the recursive aspect of letting the network learn what an 800-node search would find :) > Because those 800 nodes will be searching for that...

Don't use losing positions to train policy

Right, there's a noisy aspect of the final outcome. But that should help us: It will promote some moves out of lost positions, proportionally to how close those moves are...

Don't use losing positions to train policy

I think the training is not the bottleneck here, rather the game generation is, right? So why not just fork a network, and train it in parallel. Maybe I'll just...

Don't use losing positions to train policy

Another way to express my proposal is using learning weights for the policy loss SGD: three different weights depending on game outcome (loss, draw, win). Currently those are 1, 1,...
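
To make the outcome-weighting idea concrete, here is a minimal sketch in plain Python. It is not the project's training code: the names `OUTCOME_WEIGHTS`, `policy_cross_entropy` and `weighted_policy_loss` are hypothetical, and the weight values are placeholders, since the comment above is cut off before it states the proposed ones. It assumes each training position carries the search's visit distribution, the network's predicted policy, and the final outcome from the side to move.

```python
import math

# Hypothetical per-outcome weights; placeholder values, not the proposal's.
OUTCOME_WEIGHTS = {"loss": 0.5, "draw": 1.0, "win": 1.0}


def policy_cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between the search target and the predicted policy."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted) if t > 0)


def weighted_policy_loss(batch, weights=OUTCOME_WEIGHTS):
    """Weighted mean of per-position policy losses.

    batch: iterable of (target, predicted, outcome) tuples, where outcome
    is one of "loss", "draw", "win" for the side to move.
    """
    total = 0.0
    weight_sum = 0.0
    for target, predicted, outcome in batch:
        w = weights[outcome]
        total += w * policy_cross_entropy(target, predicted)
        weight_sum += w
    # With every weight at zero there is nothing to average over.
    return total / weight_sum if weight_sum > 0 else 0.0
```

Setting all three weights to 1 reduces this to the ordinary unweighted mean, and setting the loss weight to 0 drops positions from lost games from the policy loss entirely, which is the extreme case named in the issue title.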

Exploration in training games

#551 seems very related. Maybe my #604 is a dupe.