
High memory usage (original: Tree doesn't grow beyond 6 million Nodes)

Open magicianfromriga opened this issue 3 years ago • 11 comments

Hi Johannes, First of all, congratulations on your latest release! This engine plays remarkably differently from any other I have seen, and it seems to do so at a very high level. I noticed an issue in infinite analysis: once the engine hit somewhere between 5 and 6 million nodes, it slowed down quite dramatically, and the tree didn't grow much beyond 6.3 million. Is this issue due to a RAM limit?

My PC:

  1. i7-4790k
  2. NVIDIA RTX 2070 Super
  3. 8 GB DDR3 RAM.

Thank you for your wonderful contribution to chess! Let me know how I can reach out to you for a potential collaboration. Tanmay Srinath.

magicianfromriga · Aug 29 '21 11:08

Hello @magicianfromriga , nice to hear that you like this project. Yes, it is possible to allocate more than 6 million nodes, and your assumption is right that you are running out of memory on your machine.

I just tested it on my desktop machine using the ClassicAra 0.9.5 executable. About 1 GiB is allocated on startup when loading the CUDA, cuDNN and TensorRT libraries. The remaining memory is allocated dynamically over time.

$ ./CrazyAra_ClassicAra_MultiAra_0.9.5_Linux_TensorRT/ClassicAra
isready
position startpos
go infinite
...
info depth 41 seldepth 61 multipv 1 score cp 31 nodes 17000024 nps 14257 tbhits 0 time 1192437 pv d2d4 d7d5 c2c4 c7c6 b1c3 g8f6 c4d5 c6d5 g1f3 b8c6 c1f4 a7a6 e2e3 c8g4 h2h3 g4f3 d1f3 e7e6 f1d3 c6b4 d3b1 a8c8 e1g1 f8d6 f4g5 h7h6 g5h4 e8g8 f3e2 d6e7 f2f4 b4c6 g2g4 f6e8 h4g3 e8d6 f4f5 e7h4 g3h2 f8e8 f5e6
info string rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
bestmove d2d4
info string apply move to tree
root
  #  | Move  |    Visits    |  Policy   |  Q-values  |  CP   |    Type    
-----+-------+--------------+-----------+------------+-------+------------
 001 |    d4 |     15916915 | 0.1981373 |  0.0998159 |    31 |  UNSOLVED
 000 |    e4 |       531758 | 0.6484740 |  0.0529837 |    16 |  UNSOLVED
 002 |   Nf3 |       106669 | 0.0669805 |  0.0752128 |    23 |  UNSOLVED
 003 |    c4 |        51625 | 0.0401248 |  0.0676878 |    20 |  UNSOLVED
 010 |    c3 |        26822 | 0.0014303 | -0.0103028 |    -3 |  UNSOLVED
 019 |    f3 |        26821 | 0.0000890 | -0.0903614 |   -28 |  UNSOLVED
 008 |    e3 |        26730 | 0.0041160 |  0.0041643 |     1 |  UNSOLVED
 009 |    d3 |        26723 | 0.0017874 | -0.0385385 |   -11 |  UNSOLVED
 005 |    g3 |        26697 | 0.0102718 |  0.0302087 |     9 |  UNSOLVED
 007 |    b3 |        26654 | 0.0052031 | -0.0252461 |    -7 |  UNSOLVED
 013 |    h4 |        26618 | 0.0006316 | -0.0421854 |   -12 |  UNSOLVED
 016 |   Na3 |        26590 | 0.0003413 | -0.0799976 |   -24 |  UNSOLVED
 006 |   Nc3 |        26544 | 0.0073106 |  0.0086331 |     2 |  UNSOLVED
 018 |    h3 |        26524 | 0.0000947 | -0.0251692 |    -7 |  UNSOLVED
 017 |   Nh3 |        26500 | 0.0002794 | -0.0697391 |   -21 |  UNSOLVED
 011 |    b4 |        26486 | 0.0011651 | -0.0404012 |   -12 |  UNSOLVED
 004 |    f4 |        26422 | 0.0116823 | -0.0581965 |   -17 |  UNSOLVED
 014 |    g4 |        26377 | 0.0005855 | -0.1250571 |   -39 |  UNSOLVED
 012 |    a3 |        26321 | 0.0008925 | -0.0210689 |    -6 |  UNSOLVED
 015 |    a4 |        26233 | 0.0004028 | -0.0606963 |   -18 |  UNSOLVED
-----+-------+--------------+-----------+------------+-------+------------
initial value:	0.0941720
nodeType:	UNSOLVED
isTerminal:	0
isTablebase:	0
unsolvedNodes:	20
Visits:		17032028
freeVisits:	32004/17032028

In this case, it allocated 17.1 GiB for 17 million nodes, i.e. roughly 1 KiB per node. That is honestly a lot.

Previously, we had the problem that one was only able to allocate up to 16.7 million nodes because the visits counter reached its numeric limit (https://github.com/QueensGambit/CrazyAra/issues/39). This has been fixed by changing the data type of the visits variable from float to uint32_t. Now you are in principle able to run about 4.29 billion (2^32 − 1 = 4,294,967,295) simulations.

Reducing the memory consumption is on the TODO list for future versions. In the meantime, you can use the UCI option Nodes_Limit to cap the number of nodes you wish to allocate when using the go infinite command or during engine tournament play.
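For example, in any UCI interface (the limit value here is arbitrary, pick one that fits your RAM):

```
setoption name Nodes_Limit value 5000000
position startpos
go infinite
```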

Let me know how I can reach you out for a potential collaboration.

I'm not sure if you are interested in collaborating via coding. If so, you can follow the build instructions in the wiki pages. If you have a Linux system, you can use the update.sh shell script, which is the script that was used to install ClassicAra on TCEC. Additionally, if you want to set up all dependencies to start reinforcement learning, you can make use of the Dockerfile.

If you want, you can start working on the memory issue. One approach would be to replace the vectors in the NodeData class with a struct of single values: https://github.com/QueensGambit/CrazyAra/blob/0f3d60f48fa914209664d74a9eba329c6fc4b54c/engine/src/nodedata.h#L90 However, I'm afraid that working on this problem is not ideal for newcomers to the project, as it requires a lot of refactoring and a good understanding of the code base.

You can also write me an email using the address given on my profile page.

QueensGambit · Aug 29 '21 17:08

Hi Johannes, Thanks for the prompt response! With regards to coding, I am average at best. I was looking at contributing positions where ClassicAra seems to struggle compared to other top engines in the world, i.e. adding my chess expertise to the project. Let me know if that would be of use to you.

I will take a serious look at helping with reinforcement learning, though right now my focus is only on the ClassicAra part of your project, as that holds maximum relevance to what I am doing (high-level correspondence chess).

Best wishes, Tanmay Srinath.

magicianfromriga · Aug 30 '21 13:08

Sure, creating a new issue which summarizes problematic positions for ClassicAra can help fix search problems. You can follow a similar structure as in:

  • https://github.com/LeelaChessZero/lc0/issues/164

One common way to analyze a chess engine's strengths and weaknesses is to use test suites such as the Eigenmann Rapid Engine Test (ERET).

  • https://www.chessprogramming.org/Test-Positions
  • https://www.chessprogramming.org/Eigenmann_Rapid_Engine_Test
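Such test suites are usually distributed as EPD files: one position per line, a FEN-like board description followed by opcodes such as bm (best move) and id. A made-up illustration of the format (not an actual ERET position):

```
k7/8/KQ6/8/8/8/8/8 w - - bm Qb7#; id "example.001";
```

A test runner feeds each position to the engine and checks whether the engine's choice matches the bm move within a time limit.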

However, test suites cannot fully replace traditional Elo testing.

QueensGambit · Aug 31 '21 13:08

Hi Johannes! I would like to start training nets for ClassicAra. Can you point me to resources that would allow me to set up a training pipeline? Thanks!

magicianfromriga · Mar 16 '22 08:03

Hello again @magicianfromriga ! One way to start is to use the setup with the NVIDIA Docker container.

  • https://github.com/QueensGambit/CrazyAra/tree/master/DeepCrazyhouse/src/training#start-training-from-a-docker-container

However, this requires using Linux.

Alternatively, you may install the dependencies from the requirements.txt file.

  • https://github.com/QueensGambit/CrazyAra/blob/master/DeepCrazyhouse/src/training/requirements.txt

If you want to use supervised learning, you first need to create a data set from PGN files.

  • https://github.com/QueensGambit/CrazyAra/tree/master/DeepCrazyhouse/src/preprocessing/download_pgns
  • https://github.com/QueensGambit/CrazyAra/blob/master/DeepCrazyhouse/src/preprocessing/convert_pgn_to_planes.ipynb
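The notebook above does the actual conversion to input planes. As a rough, standard-library-only illustration of the very first step (splitting a game's PGN text into its raw SAN move sequence; the real pipeline is the scripts linked above, and this helper name is made up):

```python
import re

def pgn_moves(pgn_text):
    """Extract SAN move tokens from a single-game PGN string.

    Strips tag pairs, comments, move numbers and the game result,
    leaving the move sequence needed to replay the game.
    """
    # Drop header tag pairs like [Event "..."] and {comments}.
    body = re.sub(r"\[[^\]]*\]", "", pgn_text)
    body = re.sub(r"\{[^}]*\}", "", body)
    moves = []
    for tok in body.split():
        if re.fullmatch(r"\d+\.+", tok):           # move numbers: "1." / "1..."
            continue
        if tok in ("1-0", "0-1", "1/2-1/2", "*"):  # game result marker
            continue
        moves.append(tok)
    return moves

sample = '[Event "?"]\n\n1. e4 e5 2. Nf3 Nc6 1-0'
print(pgn_moves(sample))  # ['e4', 'e5', 'Nf3', 'Nc6']
```

Each extracted move would then be replayed on a board and encoded into the plane representation the network expects.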

The configuration files can be found here:

  • https://github.com/QueensGambit/CrazyAra/tree/master/DeepCrazyhouse/configs

You need to rename main_config_template.py to main_config.py for it to work.

An exemplary data set for crazyhouse can be downloaded here:

  • https://github.com/QueensGambit/CrazyAra/wiki/Stockfish-10:-Crazyhouse-Self-Play

The current training is done in MXNet. However, a next step for this project is to add PyTorch training support. If you are familiar with PyTorch and like coding, you can start working on setting up a PyTorch training loop. Afterwards, you can open a PR if you like.

The class TrainerAgent could be converted into an abstract class and inherited by a new TrainerAgentPytorch class:

  • https://github.com/QueensGambit/CrazyAra/blob/master/DeepCrazyhouse/src/training/trainer_agent.py
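One possible shape for that refactoring (a simplified sketch with made-up method names; the real TrainerAgent interface is much richer):

```python
from abc import ABC, abstractmethod

class TrainerAgent(ABC):
    """Framework-agnostic training loop skeleton (simplified sketch)."""

    def train(self, batches):
        # The shared loop lives in the base class; only the
        # framework-specific step is delegated to subclasses.
        return [self.train_step(batch) for batch in batches]

    @abstractmethod
    def train_step(self, batch):
        """Run one optimization step and return the batch loss."""

class TrainerAgentPytorch(TrainerAgent):
    """PyTorch back-end: would hold the model, optimizer and loss."""

    def train_step(self, batch):
        # Placeholder: a real implementation would do something like
        # optimizer.zero_grad(); loss.backward(); optimizer.step()
        return 0.0

agent = TrainerAgentPytorch()
print(agent.train([1, 2, 3]))  # [0.0, 0.0, 0.0]
```

The MXNet logic would move into a sibling subclass, so both back-ends share the same loop and logging code.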

The following repository can be used as a reference:

  • https://gitlab.com/jweil/PommerLearn/-/tree/master/pommerlearn

ONNX is currently used as the main network format to allow a flexible exchange between different deep learning frameworks. You can use Netron to inspect the neural network architecture:

  • https://github.com/lutzroeder/netron

QueensGambit · Mar 16 '22 18:03

Thanks for the response! Will WSL (Windows Subsystem for Linux) work instead?

magicianfromriga · Mar 18 '22 13:03

Sadly no, but I may be wrong about this.

QueensGambit · Mar 19 '22 15:03

Hi Johannes, Would it be fine if I shared the PGN files for training with you? Since I don't have Linux at the moment, I am not sure how else I can contribute to the training process. To which email address should I send them? Thanks, Tanmay Srinath.

magicianfromriga · Jun 09 '22 04:06

Hello Tanmay Srinath, you can use my email address as shown on https://www.aiml.informatik.tu-darmstadt.de/people/jczech/
I'm currently also in the process of adding PyTorch as a new neural network framework back-end in order to start an RL run for classical chess.

QueensGambit · Jun 09 '22 10:06

Thanks! If it's a PyTorch trainer, perhaps I can use it on Windows as well. Also, I would love to write a script that automates compilation on Windows. How do you compile your releases on Windows? Can you share your existing script with me so that I can try to generalise it? Tanmay Srinath.

magicianfromriga · Jun 10 '22 03:06

Hi, you can find the Linux compilation script for release 1.0.0 here:

  • https://github.com/QueensGambit/CrazyAra/releases/download/1.0.0/update.sh

I added instructions for building CrazyAra on Windows in the wiki:

  • https://github.com/QueensGambit/CrazyAra/wiki/3.-Build-CrazyAra-binary

Currently, I'm building the Windows binaries manually according to the wiki pages. This usually involves updating the CUDA libraries as well.

QueensGambit · Jun 10 '22 13:06