Add EGTB support to fishtest

Open protonspring opened this issue 4 years ago • 41 comments

If we add a directory with EGTB files to the Stockfish repository, Stockfish can't find those files when it is executed by fishtest.

This seems most likely due to fishtest not executing the stockfish binary in the same folder where it was built.

protonspring avatar Mar 06 '20 18:03 protonspring

Fishtest deletes the temporary folder after building Stockfish, so supporting this requires a PR.

https://github.com/glinscott/fishtest/blob/d72251e6859a732a9662c74fbb5bcbae3c05babb/worker/games.py#L143-L145

EDIT_000: there are two different ways.

  • EGTB files don't change between tests: download them only once into the "testing" folder from the books repo.
  • EGTB files change between tests: keep the EGTB files in the Stockfish repo and move them from the temporary build folder into the "testing" folder.

ppigazzini avatar Mar 06 '20 20:03 ppigazzini

The EGTB files should almost never change. Should I add them to the books repo and open a PR?

protonspring avatar Mar 06 '20 21:03 protonspring

I think that's a bit quick honestly. Things to consider:

  • Playing with EGTB on HDD instead of SSD is a loss of Elo... certainly a good way to introduce more variance.
  • Do we really want to remove EG knowledge? (Phones, browsers, etc. might not like to have EGTB.)
  • Fishtest users might not want to download/store X GB of data

vondele avatar Mar 06 '20 21:03 vondele

In this SF branch the EGTB files were added with the message "No endgames stuff" and the size is very low: I thought that the EGTB were used to test some EG ideas.

ppigazzini avatar Mar 06 '20 21:03 ppigazzini

I agree with @vondele.

The test was meant to demonstrate that we might drop all the endgame stuff and bundle Stockfish with the most basic EGTB files instead. This would also work for phones, tablets, etc. since the size of the egtbs folder is very small.

Unfortunately, the way I tried to test this is not supported by fishtest. So the basic question seems to be if there is a simple way to tweak fishtest to make this work? For now it looks like the answer is "No" ...

joergoster avatar Mar 06 '20 22:03 joergoster

The set of files here was only 3 MB. If that size alone is better than endgame.cpp, then isn't that a different scenario?

protonspring avatar Mar 06 '20 22:03 protonspring

I don't think it will be hard at all to fix fishtest to support this test.

protonspring avatar Mar 06 '20 23:03 protonspring

@protonspring a dirty fix is already running on the DEV server. Please mind that an N-core worker will access the hard disk N times over for the EGTB files, which causes a slowdown.

ppigazzini avatar Mar 06 '20 23:03 ppigazzini

I would expect the drive to cache those little files. If you are very worried about it, I would suggest setting up a RAM disk and putting the EGTB files on that.

protonspring avatar Mar 07 '20 01:03 protonspring

@ppigazzini Awesome!

joergoster avatar Mar 07 '20 04:03 joergoster

EDIT_001: data update with endgames.epd book
EDIT_000: data update with STC test

ELO: -2.22 +-3.1 (95%) LOS: 7.8%
Total: 20000 W: 4215 L: 4343 D: 11442
Ptnml(0-2): 401, 2504, 4359, 2294, 442

chi^2: 10.12
dof: 10
p-value: 43.05%

ELO: -2.97 +-2.9 (95%) LOS: 2.2%
Total: 20000 W: 3694 L: 3865 D: 12441
Ptnml(0-2): 326, 2384, 4764, 2187, 339

chi^2: 3.46
dof: 6
p-value: 74.87%

ELO: -11.70 +-2.1 (95%) LOS: 0.0%
Total: 20000 W: 3649 L: 4322 D: 12029
Ptnml(0-2): 13, 2189, 6266, 1522, 10

chi^2: 3.93
dof: 4
p-value: 41.57%
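
For anyone who wants to check these numbers by hand, here is a minimal sketch of how an Elo estimate and its 95% margin can be derived from a Ptnml line. It assumes the five counts are game pairs scoring 0, 0.5, 1, 1.5 and 2 points and uses a plain normal approximation; the function names are illustrative and this is not fishtest's own code.

```python
import math


def elo_from_score(score):
    """Convert an average per-game score in (0, 1) to a logistic Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def pentanomial_elo(ptnml, z=1.96):
    """Return (elo, margin) from game-pair counts for pair scores 0, 0.5, 1, 1.5, 2."""
    pairs = sum(ptnml)
    # Per-game score carried by each pair outcome: 0, 0.25, 0.5, 0.75, 1.
    values = [i / 4.0 for i in range(5)]
    mean = sum(v * n for v, n in zip(values, ptnml)) / pairs
    var = sum(n * (v - mean) ** 2 for v, n in zip(values, ptnml)) / pairs
    # Half-width of the confidence interval on the score scale.
    half_width = z * math.sqrt(var / pairs)
    elo = elo_from_score(mean)
    # Map the score interval to Elo and report half of its width.
    margin = (elo_from_score(mean + half_width) - elo_from_score(mean - half_width)) / 2.0
    return elo, margin


elo, margin = pentanomial_elo([401, 2504, 4359, 2294, 442])
print(f"ELO: {elo:.2f} +-{margin:.1f} (95%)")
```

Fed the first Ptnml line above, the sketch should land close to the reported -2.22 +-3.1.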

Workers started according to this wiki section using this repo/branch.

Here is the quick and dirty patch (not production-ready):

$ git diff master
diff --git a/worker/games.py b/worker/games.py
index e2a64e1..5c02d7a 100644
--- a/worker/games.py
+++ b/worker/games.py
@@ -138,6 +138,11 @@ def setup_engine(destination, worker_dir, sha, repo_url, concurrency):


     shutil.move('stockfish'+ EXE_SUFFIX, destination)
+
+    testing_dir=os.path.join(worker_dir, 'testing')
+    if not os.path.exists(os.path.join(testing_dir, 'egtbs')):
+      shutil.move('egtbs', testing_dir)
+
   except:
     raise Exception('Failed to setup engine for %s' % (sha))
   finally:
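
Note that the patch keeps whichever "egtbs" folder reached the "testing" folder first, and that the move raises if the branch being built ships no "egtbs" folder at all. A slightly more defensive variant could look like the sketch below; `move_egtbs` and its parameters are hypothetical names, not part of fishtest.

```python
import os
import shutil


def move_egtbs(build_dir, testing_dir, name="egtbs"):
    """Move build_dir/name into testing_dir once; return True if a move happened."""
    source = os.path.join(build_dir, name)
    target = os.path.join(testing_dir, name)
    if not os.path.isdir(source):
        return False  # this branch ships no tablebases, nothing to do
    if os.path.exists(target):
        return False  # keep the copy that is already in place
    os.makedirs(testing_dir, exist_ok=True)
    shutil.move(source, target)
    return True
```

It still does not handle two branches that ship different tablebase sets, which is the harder case.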

ppigazzini avatar Mar 07 '20 08:03 ppigazzini

@ppigazzini Thank you very much!

A small elo loss was to be expected, of course! I tried to keep the size of the egtbs folder really small for the first test. And not everybody is running from an SSD.

Getting rid of >1,200 lines of code at the cost of 2-6 Elo definitely seems worth considering, imho. @vondele What's your opinion? Most users will test or analyze with a full set of 5- or 6-man Syzygy tablebases anyway, at minimum! Although some rating list testers still test engines without EGTBs.

OTOH, it is fascinating to see that even this handcoded endgame knowledge, partially incomplete and even causing some eval discontinuities, is worth something!

joergoster avatar Mar 07 '20 10:03 joergoster

@joergoster first, I think these are useful experiments to do. However, my feeling is that there is value in the 'human coded' endgame knowledge: not only Elo, but also some condensed chess knowledge. In fact I wish we could extend it further (e.g. KRPvKBP or KRPsvKRPs). So replacing ~1000 lines of code with more than ~1 MB of data seems no win to me. That's in line with your last remark that it is fascinating that such incomplete knowledge has value. Note that even the 150 GB 6-men TB (and about 1700 lines of code) seems worth only about 10-20 Elo, depending on TC (https://github.com/glinscott/fishtest/wiki/UsefulData#elo-gain-using-syzygy).

I really don't think that 'most users' will analyze with 5/6-men TB. Engine enthusiasts likely will, but most users are elsewhere, downloading an app on their phone or using the WebAssembly version via lichess.

vondele avatar Mar 07 '20 11:03 vondele

as a PS, maybe this would be an interesting test to run against the endgames.epd book.

vondele avatar Mar 07 '20 11:03 vondele

I would be interested in adding only the tables necessary to replace some specific endgames.

I have less experience than you folks, but KRPKR or KBPKR seem quite challenging. Perhaps we could pick just a few of the tough ones and consider them individually.

I am still in favor of adding this minimalist set (3 MB) without removing endgame code, and of changing fishtest. This way we can do more testing on individual tables and add them as needed, or where a particular table proves to be helpful (without being too big).

protonspring avatar Mar 07 '20 15:03 protonspring

At the very least, fix the directory issue so that a fishtest user could add an egtb to their code branch and test it on the framework.

protonspring avatar Mar 07 '20 16:03 protonspring

@protonspring the quick fix for a fast test was easy (see the update to my previous post); a production-grade fix would require additional work.

  • If the EGTB files are chosen by the maintainer, then the proper way (and the easier one to add to fishtest) is to download them only once from the "books" repo (like the book files and the cutechess-cli binaries).
  • If the EGTB files are chosen by the developer and deployed to fishtest through the developer's SF repo, we need to make a few changes to the fishtest code: move each "stockfish_" engine and its "egtbs" into a "stockfish_" subfolder, change the binaries cache code, change the update code, check the cutechess-cli command, etc.
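
The first option (download once, then reuse) could be sketched roughly as follows. The helper name `ensure_egtbs` and the `fetch` callable are assumptions made for illustration: `fetch(name)` stands for whatever code would actually retrieve a file's bytes from the books repo, which is left out here because its layout for tablebases is not defined anywhere in this thread.

```python
import os


def ensure_egtbs(testing_dir, filenames, fetch):
    """Fetch each missing tablebase file into testing_dir/egtbs; return the names fetched."""
    egtb_dir = os.path.join(testing_dir, "egtbs")
    os.makedirs(egtb_dir, exist_ok=True)
    fetched = []
    for name in filenames:
        path = os.path.join(egtb_dir, name)
        if os.path.exists(path):
            continue  # already downloaded by an earlier run
        data = fetch(name)  # hypothetical downloader returning bytes
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        tmp_path = path + ".part"
        with open(tmp_path, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)
        fetched.append(name)
    return fetched
```

A second call with the same file list returns an empty list and touches neither the network nor the disk contents.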

By the way, I lack SF know-how; e.g. I don't know whether SF looks for the "egtbs" folder by default.

ppigazzini avatar Mar 08 '20 11:03 ppigazzini

> By the way I lack SF know-how e.g. I don't know if SF at default behavior looks for the "egtbs" folder.

By default SF doesn't look for EGTB files.
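
Tablebase probing is switched on through the UCI option SyzygyPath, so a test harness would have to pass the folder explicitly. Below is a minimal sketch of building the engine part of a cutechess-cli command line, assuming its `option.NAME=VALUE` syntax for engine options; `engine_args` and its defaults are hypothetical and not taken from fishtest.

```python
import os


def engine_args(binary, egtb_dir=None, threads=1, hash_mb=16):
    """Build the '-engine ...' arguments for one engine as a list of strings."""
    args = [
        "-engine",
        f"cmd={binary}",
        f"option.Threads={threads}",
        f"option.Hash={hash_mb}",
    ]
    # Only point the engine at tablebases when the folder really exists.
    if egtb_dir and os.path.isdir(egtb_dir):
        args.append(f"option.SyzygyPath={os.path.abspath(egtb_dir)}")
    return args
```

Using an absolute path keeps the setting valid even when the engine is started from a different working directory than the one holding the files.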

joergoster avatar Mar 08 '20 11:03 joergoster

> By default SF doesn't look for EGTB files.

OK, so this change (and how to implement it) is in @vondele's hands :)

ppigazzini avatar Mar 08 '20 12:03 ppigazzini

> as a PS, maybe this would be an interesting test to run against the endgames.epd book.

I started a new fixed test using the endgames.epd book.

To everybody: feel free to submit tests on DEV (use your credentials or "user01/user01" to submit and "user00/user00" to approve). Please mind that the DEV db is synchronized with the PROD db from time to time, so the results don't last: copy the test result, not the link.

To join some workers:

#!/bin/bash
dir_num=${1:-'00'}
usr_pwd=${2:-'user01'}
test_folder=${HOME}/_git/__test_folder${dir_num}
virtual_env=${test_folder}/fishtest/worker/env
fish_host=dfts-0.pigazzini.it

rm -rf ${test_folder}
mkdir -p ${test_folder}
cd ${test_folder}

git clone --single-branch --branch master https://github.com/glinscott/fishtest.git
cd ${test_folder}/fishtest
git config user.email "your_email@example.com"
git config user.name "your_name"

# add here the upstream branch to be tested
git remote add new-upstream https://github.com/ppigazzini/fishtest
git pull --no-edit new-upstream egtb_test

# add here the PRs to be tested
#git pull --no-edit origin pull/539/head

cd ${test_folder}/fishtest/worker
arch_cpu=x86-64
# test the more specific feature first: CPUs with BMI2 also report POPCNT
if [ "$(g++ -Q -march=native --help=target | grep mbmi2 | grep enabled)" ] ; then
  arch_cpu=x86-64-bmi2
elif [ "$(g++ -Q -march=native --help=target | grep mpopcnt | grep enabled)" ] ; then
  arch_cpu=x86-64-modern
fi
echo "CXXFLAGS='-march=native' make profile-build -j ARCH=${arch_cpu} COMP=gcc" > custom_make.txt

python3 -m venv ${virtual_env}
${virtual_env}/bin/pip install --upgrade pip setuptools wheel
${virtual_env}/bin/pip install requests

${virtual_env}/bin/python3 worker.py --host ${fish_host} ${usr_pwd} ${usr_pwd} --concurrency 3
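
The architecture choice made by the script can also be sketched in Python. The point to get right is the order of the checks: a CPU that supports BMI2 also reports POPCNT, so the more specific feature has to be tested first. This is a Linux-only sketch that reads /proc/cpuinfo; the function names are illustrative, while the ARCH values are the ones used in the script.

```python
def pick_arch(cpu_flags):
    """Map a collection of CPU feature flags to a Stockfish ARCH value."""
    flags = set(cpu_flags)
    if "bmi2" in flags:
        return "x86-64-bmi2"
    if "popcnt" in flags:
        return "x86-64-modern"
    return "x86-64"


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the feature flags of the first CPU listed, or [] if unavailable."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    return line.split(":", 1)[1].split()
    except OSError:
        pass  # not Linux, or the file is not readable
    return []


print(pick_arch(read_cpu_flags()))
```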

ppigazzini avatar Mar 08 '20 12:03 ppigazzini

@ppigazzini endgames.epd gives a much more sensitive result! Is there some info about what books are available to choose from?

joergoster avatar Mar 08 '20 13:03 joergoster

> Is there some info about what books are available to choose from?

This is the books repo; perhaps @vondele can help with some info about the books: https://github.com/official-stockfish/books

ppigazzini avatar Mar 08 '20 14:03 ppigazzini

Thanks. I wasn't aware of this special repo.

joergoster avatar Mar 08 '20 14:03 joergoster

If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.

What is the reasoning for changing the working directory?

protonspring avatar Mar 08 '20 14:03 protonspring

@vondele Going one step further and stripping off scale factor downscaling, too, reveals a big elo loss. See https://tests.stockfishchess.org/tests/view/5e64d119e42a5c3b3ca2e2af

However, in a quick local test, 5-man syzygy bases almost equalized the loss!

Finished game 400 (SF-NoEG2 vs SF-5-man): 1/2-1/2 {Draw by adjudication}
Score of SF-5-man vs SF-NoEG2: 75 - 43 - 282 [0.540] 400
Elo difference: 27.9 +/- 18.4, LOS: 99.8 %, DrawRatio: 70.5 %
Finished match
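
As a sanity check on these figures, LOS and draw ratio can be recomputed from the raw win/loss/draw counts. The sketch below uses the usual normal-approximation formula for LOS; whether cutechess-cli computes it in exactly this way is an assumption, and the function names are illustrative.

```python
import math


def los(wins, losses):
    """Likelihood of superiority from decisive games only (normal approximation)."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


def draw_ratio(wins, losses, draws):
    """Fraction of all games that ended in a draw."""
    return draws / (wins + losses + draws)


wins, losses, draws = 75, 43, 282
print(f"LOS: {100 * los(wins, losses):.1f} %, DrawRatio: {100 * draw_ratio(wins, losses, draws):.1f} %")
```

With only 400 games the +/- 18.4 margin is wide, so the LOS value is the more telling figure here.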

I'm now running the same test with 6-man files.

joergoster avatar Mar 08 '20 14:03 joergoster

> If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.
>
> What is the reasoning for changing the working directory?

Keep the fishtest code simple, have short path names, have a low hd space requirement for CPU contributors etc. etc.

ppigazzini avatar Mar 08 '20 15:03 ppigazzini

Sorry for the dumb questions, but wouldn't it be much simpler to just run the executable in the build directory?

protonspring avatar Mar 08 '20 15:03 protonspring

> Sorry for the dumb questions, but wouldn't it be much simpler to just run the executable in the build directory?

The original implementation moved the binaries into the "testing" folder, so running the binary from the build folder requires some refactoring (building, updating, binaries cache, cutechess-cli command, etc.).

Keep in mind that in the early days the Windows workers were not able to build the binaries and simply downloaded them from a binaries builder.

ppigazzini avatar Mar 08 '20 16:03 ppigazzini

@protonspring fishtest was coded in a hurry to support Stockfish and can surely be improved. Feel free to join the fishtest developers community :)

ppigazzini avatar Mar 08 '20 16:03 ppigazzini

Two questions:

  1. Are all workers building now?
  2. Are the branches to compare (master vs patch) built in different directories?

protonspring avatar Mar 08 '20 19:03 protonspring