fishtest
Add EGTB support to fishtest
If we add a directory with EGTB files to the Stockfish repository, Stockfish can't find them when it is executed.
This is most likely because fishtest does not execute the Stockfish binary in the folder where it was built.
Fishtest deletes the temporary folder after building Stockfish, so supporting this requires a PR:
https://github.com/glinscott/fishtest/blob/d72251e6859a732a9662c74fbb5bcbae3c05babb/worker/games.py#L143-L145
EDIT_000: there are two different ways.
- EGTB files don't change in different tests: download them only once in the "testing" folder from the books repo
- EGTB files change in different tests: keep the EGTB files in the Stockfish repo and move them in the "testing" folder from the temporary building folder.
The EGTB files should almost never change. Should I add them to the books repo and do a PR?
I think that's a bit quick honestly. Things to consider:
- Playing with EGTB on HDD instead of SSD is a loss of Elo... certainly a good way to introduce more variance.
- Do we really want to remove EG knowledge? (Phones, browsers, etc. might not like to have EGTB.)
- Fishtest users might not want to download/store X GB of data
In this SF branch the EGTB files were added with the message "No endgames stuff" and the size is very low: I thought that the EGTB were used to test some EG ideas.
I agree with @vondele.
The test was meant to demonstrate that we might drop all the endgame stuff and bundle Stockfish with the most basic EGTB files instead. This would also work for phones, tablets, etc. since the size of the egtbs folder is very small.
Unfortunately, the way I tried to test this is not supported by fishtest. So the basic question seems to be if there is a simple way to tweak fishtest to make this work? For now it looks like the answer is "No" ...
The set of files here was only 3MB. If that size alone is better than endgame.cpp, then isn't that a different scenario?
I don't think it will be hard at all to fix fishtest to support this test.
@protonspring a dirty fix is already running on the DEV server. Please mind that an N-core worker will access the hard drive N times for the EGTB files, which slows things down.
I would expect the drive to cache those little files. If you are very worried about it, I would suggest setting up a RAM disk and putting the EGTB files on it.
@ppigazzini Awesome!
EDIT_001: data update with endgames.epd book EDIT_000: data update with STC test
ELO: -2.22 +-3.1 (95%) LOS: 7.8%
Total: 20000 W: 4215 L: 4343 D: 11442
Ptnml(0-2): 401, 2504, 4359, 2294, 442
chi^2: 10.12
dof: 10
p-value: 43.05%
ELO: -2.97 +-2.9 (95%) LOS: 2.2%
Total: 20000 W: 3694 L: 3865 D: 12441
Ptnml(0-2): 326, 2384, 4764, 2187, 339
chi^2: 3.46
dof: 6
p-value: 74.87%
ELO: -11.70 +-2.1 (95%) LOS: 0.0%
Total: 20000 W: 3649 L: 4322 D: 12029
Ptnml(0-2): 13, 2189, 6266, 1522, 10
chi^2: 3.93
dof: 4
p-value: 41.57%
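As a rough cross-check of how the ELO, error bar and LOS lines relate to the Ptnml counts, here is a minimal sketch. It assumes a logistic Elo model and a normal approximation on game pairs; the function names are made up for illustration and fishtest's own statistics code may differ in detail.

```python
import math

def elo_from_score(s):
    """Logistic Elo difference for an expected score s in (0, 1)."""
    return -400.0 * math.log10(1.0 / s - 1.0)

def pentanomial_stats(ptnml):
    """Return (elo, 95% error, LOS) from game-pair counts.

    ptnml holds the number of pairs that scored 0, 0.5, 1, 1.5 and 2 points.
    """
    pairs = sum(ptnml)
    # Per-game score of each pair outcome (pair score divided by 2 games).
    per_game = [0.0, 0.25, 0.5, 0.75, 1.0]
    mean = sum(n * x for n, x in zip(ptnml, per_game)) / pairs
    var = sum(n * (x - mean) ** 2 for n, x in zip(ptnml, per_game)) / pairs
    se = math.sqrt(var / pairs)  # standard error of the mean score
    z95 = 1.959964               # two-sided 95% normal quantile
    low = elo_from_score(mean - z95 * se)
    high = elo_from_score(mean + z95 * se)
    # Likelihood of superiority: P(true score > 0.5) under the normal approximation.
    los = 0.5 * (1.0 + math.erf((mean - 0.5) / (se * math.sqrt(2.0))))
    return elo_from_score(mean), (high - low) / 2.0, los

# Ptnml counts copied from the first result block above.
elo, err, los = pentanomial_stats([401, 2504, 4359, 2294, 442])
print(f"ELO: {elo:.2f} +-{err:.1f} (95%) LOS: {los:.1%}")
```

With the counts from the first block this should land on the reported -2.22 +-3.1 and 7.8%, which suggests the error bars come from the pair counts rather than from the plain W/L/D totals.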
Workers started according to this wiki section using this repo/branch.
Here the quick and dirty patch (not production ready):
$ git diff master
diff --git a/worker/games.py b/worker/games.py
index e2a64e1..5c02d7a 100644
--- a/worker/games.py
+++ b/worker/games.py
@@ -138,6 +138,11 @@ def setup_engine(destination, worker_dir, sha, repo_url, concurrency):
     shutil.move('stockfish'+ EXE_SUFFIX, destination)
+
+    testing_dir=os.path.join(worker_dir, 'testing')
+    if not os.path.exists(os.path.join(testing_dir, 'egtbs')):
+      shutil.move('egtbs', testing_dir)
+
   except:
     raise Exception('Failed to setup engine for %s' % (sha))
   finally:
@ppigazzini Thank you very much!
A small elo loss was to be expected, of course! I tried to keep the size of the egtbs folder really small for the first test. And not everybody is running from an SSD.
Getting rid of >1,200 lines of code at the cost of 2-6 elo seems definitely worth a consideration, imho. @vondele What's your opinion? Most users will test or analyze with a full set of 5- or 6-man syzygy's anyways! At minimum ... Although, some rating list testers still test without EGTBs for engines.
OTOH, it is fascinating to see that even this handcoded endgame knowledge, partially incomplete and even causing some eval discontinuities, is worth something!
@joergoster first, I think these are useful experiments to do. However, my feeling is that there is value in the 'human coded' endgame knowledge: not only Elo, but also some condensed chess knowledge. In fact I wish we could extend it more (e.g. the KRPvKBP or KRPsvKRPs). So replacing ~1000 lines of code with ~>1MB of data seems like no win to me. That's in line with your last remark that it is fascinating that such incomplete knowledge has value. Note that even the 150GB 6-men TB (and about 1700 lines of code) seems worth only about 10-20 Elo depending on TC (https://github.com/glinscott/fishtest/wiki/UsefulData#elo-gain-using-syzygy).
I really don't think that 'most users' will analyze with 5/6men TB. The engine enthusiasts likely, but most users are elsewhere, downloading an app on the phone, or using the webasm version via lichess.
as a PS, maybe this would be an interesting test to run against the endgames.epd book.
I would be interested in adding only the necessary tables to replace some specific endgames.
I have less experience than you folks, but KRPKR or KBPKR seem quite challenging. Perhaps pick just a few of the tough ones and consider them individually.
I am still in favor of adding this minimalist set (3 MB) without removing endgame code and changing fishtest. This way we can do more testing on individual tables and add as needed or where a particular book proves to be helpful (without being too big).
At the very least, fix the directory issue so that a fishtest user could add an egtb to their code branch and test it on the framework.
@protonspring the quick fix for a fast test was easy (see the update to my previous post), but a production-grade fix would require additional work.
- If the EGTB files are chosen by the maintainer then the proper (and easier to add in fishtest) way is to download them only once from the "books" repo (like the books files and the cutechess-cli binaries).
- If the EGTB files are chosen by the developer and deployed to fishtest through the developer SF repo, we need to make a few changes on fishtest code: move each "stockfish_" engine and "egtbs" in a "stockfish_" subfolder, change the binaries cache code, change the update code, check the cutechess-cli command etc. etc.
By the way I lack SF know-how e.g. I don't know if SF at default behavior looks for the "egtbs" folder.
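The second option in the list above (each build gets its own subfolder holding the engine and its "egtbs") could look roughly like the sketch below. Everything here is an assumption for illustration: `stage_build`, the `tag` argument and the folder naming are hypothetical and not taken from the fishtest code.

```python
import os
import shutil

def stage_build(build_dir, testing_dir, tag, exe_name="stockfish", egtb_name="egtbs"):
    """Move a freshly built engine, plus its EGTB folder if present,
    from build_dir into testing_dir/<exe_name>_<tag>/ (hypothetical layout)."""
    dest = os.path.join(testing_dir, f"{exe_name}_{tag}")
    os.makedirs(dest, exist_ok=True)

    # The engine binary always moves next to its own tablebases.
    shutil.move(os.path.join(build_dir, exe_name), os.path.join(dest, exe_name))

    # The EGTB folder is optional: most branches will not ship one.
    egtb_src = os.path.join(build_dir, egtb_name)
    if os.path.isdir(egtb_src):
        egtb_dest = os.path.join(dest, egtb_name)
        if os.path.exists(egtb_dest):
            shutil.rmtree(egtb_dest)  # drop a stale copy from an earlier build
        shutil.move(egtb_src, egtb_dest)
    return dest
```

Keeping the tablebases beside the binary means master and patch can each carry a different set, at the cost of touching the binaries cache and the cutechess-cli command as noted above.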
By default SF doesn't look for EGTB files.
OK, so this change (and how to implement) is in the @vondele hands :)
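Since SF does not look for EGTB files by default, the usual way to point it at Syzygy tablebases is the `SyzygyPath` UCI option. The sketch below only builds the command strings; the helper names are made up, and whether the test branch relied on `SyzygyPath` or on a path set in its own code is not stated in this thread.

```python
def syzygy_setoption(path):
    """UCI command that points the engine at a Syzygy tablebase folder."""
    return f"setoption name SyzygyPath value {path}"

def cutechess_engine_args(cmd, name, syzygy_path=None):
    """Build the '-engine' arguments for cutechess-cli, optionally passing
    SyzygyPath through cutechess-cli's 'option.NAME=VALUE' syntax."""
    args = ["-engine", f"cmd={cmd}", f"name={name}"]
    if syzygy_path:
        args.append(f"option.SyzygyPath={syzygy_path}")
    return args

print(syzygy_setoption("./egtbs"))
print(" ".join(cutechess_engine_args("./stockfish", "SF-egtb", "./egtbs")))
```

A relative path such as `./egtbs` is resolved against the engine's working directory, which is why it matters where fishtest runs the binary from.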
as a PS, maybe this would be an interesting test to run against the endgames.epd book.
I started a new fixed test using the endgames.epd book.
To everybody: feel free to submit tests on DEV (use your credentials or "user01/user01" to submit and "user00/user00" to approve). Please mind that the DEV db is synchronized from time to time with the PROD db, so the results don't last: copy the test result, not the link.
To join some workers:
#!/bin/bash
dir_num=${1:-'00'}
usr_pwd=${2:-'user01'}
test_folder=${HOME}/_git/__test_folder${dir_num}
virtual_env=${test_folder}/fishtest/worker/env
fish_host=dfts-0.pigazzini.it
rm -rf ${test_folder}
mkdir -p ${test_folder}
cd ${test_folder}
git clone --single-branch --branch master https://github.com/glinscott/fishtest.git
cd ${test_folder}/fishtest
git config user.email "[email protected]"
git config user.name "your_name"
# add here the upstream branch to be tested
git remote add new-upstream https://github.com/ppigazzini/fishtest
git pull --no-edit new-upstream egtb_test
# add here the PRs to be tested
#git pull --no-edit origin pull/539/head
cd ${test_folder}/fishtest/worker
arch_cpu=x86-64
# check BMI2 first: CPUs with BMI2 also report POPCNT, so the reverse order never selects bmi2
if [ "$(g++ -Q -march=native --help=target | grep mbmi2 | grep enabled)" ] ; then
  arch_cpu=x86-64-bmi2
elif [ "$(g++ -Q -march=native --help=target | grep mpopcnt | grep enabled)" ] ; then
  arch_cpu=x86-64-modern
fi
echo "CXXFLAGS='-march=native' make profile-build -j ARCH=${arch_cpu} COMP=gcc" > custom_make.txt
python3 -m venv ${virtual_env}
${virtual_env}/bin/pip install --upgrade pip setuptools wheel
${virtual_env}/bin/pip install requests
${virtual_env}/bin/python3 worker.py --host ${fish_host} ${usr_pwd} ${usr_pwd} --concurrency 3
@ppigazzini endgames.epd gives a much more sensitive result! Is there some info about what books are available to choose from?
This is the books repo, perhaps @vondele could be able to help with some info about the books https://github.com/official-stockfish/books
Thanks. I wasn't aware of this special repo.
If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.
What is the reasoning for changing the working directory?
@vondele Going one step further and stripping off scale factor downscaling, too, reveals a big elo loss. See https://tests.stockfishchess.org/tests/view/5e64d119e42a5c3b3ca2e2af
However, in a quick local test, 5-man syzygy bases almost equalized the loss!
Finished game 400 (SF-NoEG2 vs SF-5-man): 1/2-1/2 {Draw by adjudication}
Score of SF-5-man vs SF-NoEG2: 75 - 43 - 282 [0.540] 400
Elo difference: 27.9 +/- 18.4, LOS: 99.8 %, DrawRatio: 70.5 %
Finished match
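The summary line of this local match can be recomputed from the win/loss/draw counts alone. The following is a minimal sketch assuming a logistic Elo model, a per-game (trinomial) variance and the common LOS formula based on decisive games only; `wld_stats` is a hypothetical helper, not cutechess-cli code.

```python
import math

def elo_from_score(s):
    """Logistic Elo difference for an expected score s in (0, 1)."""
    return -400.0 * math.log10(1.0 / s - 1.0)

def wld_stats(wins, losses, draws):
    """Return (score, elo, 95% margin, LOS, draw ratio) from W/L/D counts."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    se = math.sqrt(var / n)
    z95 = 1.959964
    margin = (elo_from_score(score + z95 * se) - elo_from_score(score - z95 * se)) / 2.0
    # LOS from decisive games only: Phi((W - L) / sqrt(W + L)).
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return score, elo_from_score(score), margin, los, draws / n

score, elo, margin, los, draw_ratio = wld_stats(75, 43, 282)
print(f"[{score:.3f}] Elo difference: {elo:.1f} +/- {margin:.1f}, "
      f"LOS: {los:.1%}, DrawRatio: {draw_ratio:.1%}")
```

With only 400 games the margin is about 18 Elo, so the result says the 5-man bases recover a clear part of the loss but does not pin down how much.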
I'm now running the same test with 6-man files.
If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.
What is the reasoning for changing the working directory?
Keep the fishtest code simple, have short path names, have a low hd space requirement for CPU contributors etc. etc.
Sorry for the dumb questions, but wouldn't it be much simpler to just run the executable in the build directory?
The original implementation moved the binaries into the "testing" folder, so running the binary in the building folder requires some refactoring (building, updating, binaries cache, cutechess-cli command etc.).
Keep in mind that in the early days the Windows workers were not able to build the binaries and simply downloaded them from a binaries builder.
@protonspring fishtest was coded in a hurry to support Stockfish and can surely be improved. Feel free to join the fishtest developers community :)
Two questions:
- Are all workers building now?
- Are the branches to compare ( master vs patch) built in different directories?