galvanise_zero

Stuck training in server.py

Open zhaoqxu-eth opened this issue 5 years ago • 3 comments

I have installed all of the prerequisites. However, when I run python server.py hex.conf to train models, it gets stuck after the following two lines:

2019-05-18 20:47:53,750592 [INFO    ]  checking if generation data available
2019-05-18 20:47:53,750715 [INFO    ]  Not such file for generation: [Errno 2] No such file or directory: u'/root/ggpzero/galvanise_zero/data/hex/d2/gendata_hex_0.json.gz'

Could you please give me any guidance?

zhaoqxu-eth avatar May 18 '19 21:05 zhaoqxu-eth

That's OK. It just means no self-play data exists yet for that generation. Once it has trained for a while, it will save the file in that location. If you restart, the server will pick up from the last point at which it was saved.
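To check from outside the server whether any generation data has been written yet, something like the following could be used. It is only a sketch in standalone Python 3 (not part of galvanise_zero): the helper name latest_generation is hypothetical, and the gendata_<game>_<n>.json.gz naming is inferred from the single path in the log above, so it may not hold for every setup.

```python
import re
from pathlib import Path


def latest_generation(data_dir, game):
    """Return the highest generation number found in data_dir, or None.

    Assumes files are named gendata_<game>_<n>.json.gz, as inferred from
    the path shown in the log output (an assumption, not a documented API).
    """
    pattern = re.compile(r"gendata_%s_(\d+)\.json\.gz$" % re.escape(game))
    directory = Path(data_dir)
    if not directory.is_dir():
        return None

    steps = []
    for path in directory.iterdir():
        match = pattern.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None
```

For the setup in this thread, latest_generation("/root/ggpzero/galvanise_zero/data/hex/d2", "hex") returning None would simply confirm that no self-play data has been saved so far.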

richemslie avatar May 18 '19 21:05 richemslie

I don't know if something is wrong with my installation, but no error is displayed; the server just prints the same line every ten minutes. It seems that nothing is training and no self-play data is being stored.

2019-05-19 09:11:47,561940 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:21:47,661234 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:31:47,760546 [VERBOSE ]  entering checkpoint with 0 sample accumulated
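For a longer log, the same pattern can be confirmed mechanically. The sketch below is standalone Python 3 that parses the three lines quoted above and reports the gap between checkpoints and whether any samples were accumulated. The regular expression is written against these lines only; other log messages from the server are not handled, and the function name parse_checkpoints is made up for illustration.

```python
import re
from datetime import datetime

# The three log lines quoted above, verbatim.
LOG = """\
2019-05-19 09:11:47,561940 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:21:47,661234 [VERBOSE ]  entering checkpoint with 0 sample accumulated
2019-05-19 09:31:47,760546 [VERBOSE ]  entering checkpoint with 0 sample accumulated
"""

# Timestamp, bracketed level, then the checkpoint message with a sample count.
CHECKPOINT = re.compile(
    r"^(\S+ \S+) \[\w+\s*\]\s+entering checkpoint with (\d+) sample"
)


def parse_checkpoints(text):
    """Return a list of (timestamp, sample_count) for checkpoint lines."""
    entries = []
    for line in text.splitlines():
        match = CHECKPOINT.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            entries.append((stamp, int(match.group(2))))
    return entries


checkpoints = parse_checkpoints(LOG)

# Whole seconds between consecutive checkpoints.
intervals = [
    round((later[0] - earlier[0]).total_seconds())
    for earlier, later in zip(checkpoints, checkpoints[1:])
]

# True when every checkpoint reported zero accumulated samples.
stalled = all(count == 0 for _, count in checkpoints)
```

On the lines above, intervals comes out as two gaps of 600 seconds and stalled is True, which matches the description: a checkpoint every ten minutes with nothing accumulated.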

zhaoqxu-eth avatar May 19 '19 09:05 zhaoqxu-eth

Hi. Bear with me, I am trying to write some docs to get you going.

In the meantime I pushed a revamped test in src/test/player/test_player.py.

The first thing to do after installing should be to run these tests. (And I can use the below for docs!)

First you'll need to check out the gzero_data repo. The test uses breakthroughSmall; copy its rulesheet into ggplib/data/rulesheets, then test it by running python perftest.py breakthroughSmall in ggplib/src/ggplib/scripts.

Then copy the breakthroughSmall directory from gzero_data into the data directory of the galvanise_zero repo.

Then you can run the 3 tests in src/test/player/test_player.py:

$ py.test test_player.py -s -k test_random

This will test a random neural network against a simplemcts player. It will lose!

$ py.test test_player.py -s -k test_trained

This will test a trained neural network against a simplemcts player. The model is very strong. It will win easily.

$ py.test test_player.py -s -k test_puct_v2

This will test two gzero players against each other. Both use a reasonably strong network, but one uses the puct1 player and the other the puct2 player. puct1 is what is used for self-play; puct2 is used for match mode and has many more (experimental) features, not least batching on the GPU, so it is much faster.
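If it is more convenient to run the three tests in one go, a small wrapper along these lines might do. It is only a sketch in standalone Python 3: build_command and run_all are hypothetical helpers, the command is the same py.test invocation quoted above, and it assumes py.test is on the PATH and that test_dir points at src/test/player in your checkout. Whether a lost game counts as a test failure is not stated here, so the return value only reflects py.test's exit code.

```python
import subprocess

# Test names as given above, in the suggested order.
TEST_NAMES = ["test_random", "test_trained", "test_puct_v2"]


def build_command(test_name, test_file="test_player.py"):
    """Build the py.test command line for a single test."""
    return ["py.test", test_file, "-s", "-k", test_name]


def run_all(test_dir):
    """Run each test from test_dir, stopping at the first nonzero exit code.

    Returns the name of the test whose run exited nonzero, or None if all
    three exited with code 0.
    """
    for name in TEST_NAMES:
        result = subprocess.run(build_command(name), cwd=test_dir)
        if result.returncode != 0:
            return name
    return None
```

Calling run_all("src/test/player") from the root of the galvanise_zero checkout would then run test_random, test_trained and test_puct_v2 in turn.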

Note, I aim to merge puct1/puct2 at some point.

Let me know how that goes please, and I will work on some decent instructions for training. Feel free to play around with options etc. :)

PS: A number of unit tests have rotted over time. I will also aim to fix those.

richemslie avatar May 19 '19 15:05 richemslie