nnue-pytorch
get rid of some nodchip code
https://github.com/glinscott/nnue-pytorch/blob/master/lib/nnue_training_data_formats.h contains PackedSfenValue, BitStream, and SfenPacker which are taken from the nodchip trainer. Rewrite it so that there are no licensing issues.
This might be a non-issue, but it's questionable. See http://talkchess.com/forum3/viewtopic.php?f=2&t=76736&p=885268#p885268
I've implemented PackedSfenValue format in c-chess-cli, to generate the training data. You might find this interesting.
It's engine agnostic, and can be used to sample games in any tournament conditions you would otherwise run (same as cutechess-cli), not baked into Stockfish. You can specify sample frequency, and PV resolution (resolve tactics and record only the non-check PV leaf position).
What I'm missing are:
- unit tests: a few PackedSfenValue records with their binary representation to validate the code.
- sfMove and ply (currently exported as zero).
That's cool!
One issue that I see is that we were using Stockfish internal units for score, which inadvertently means any generalization will break that, as cp is the only widely agreed format. I think it's good to try to slowly move towards a more standardized representation, but it should be visibly documented for now :)
Since I want to be engine agnostic, I am using:
- cp for non mate scores: this is natural, and corresponds to what the UI reads when parsing info ... score %d lines.
- INT16_MAX - X: for mate in X moves.
- INT16_MIN + Y: for mated in Y moves.
The only thing that all engines seem to agree on is that int16_t is enough. Mate score, however, is where they all disagree. I think SF uses 32000-2*X+1 and -32000+2*Y, so there would be some translation there (or perhaps you would just discard mate positions, as they are not really useful for training).
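To make the translation concrete, here is a minimal sketch of the two mate encodings described above. The function names are hypothetical, the sign convention follows UCI (positive for mate in X moves, negative for mated in Y moves), and the Stockfish formula is the one recalled in this comment, not checked against the trainer code.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def encode_mate_generic(moves: int) -> int:
    """Engine-agnostic encoding from the list above.

    moves > 0 means mate in `moves` moves, moves < 0 means mated in
    `-moves` moves (the UCI "score mate" sign convention).
    """
    if moves > 0:
        return INT16_MAX - moves
    if moves < 0:
        return INT16_MIN + (-moves)
    raise ValueError("mate distance must be non-zero")


def encode_mate_sf(moves: int) -> int:
    """Assumed Stockfish-style encoding: 32000-2*X+1 and -32000+2*Y."""
    if moves > 0:
        return 32000 - 2 * moves + 1
    if moves < 0:
        return -32000 + 2 * (-moves)
    raise ValueError("mate distance must be non-zero")


def encode_score(kind: str, value: int) -> int:
    """Encode a UCI score ("cp" or "mate") into an int16 value."""
    if kind == "cp":
        if not INT16_MIN <= value <= INT16_MAX:
            raise ValueError("cp score does not fit in int16")
        return value
    if kind == "mate":
        return encode_mate_generic(value)
    raise ValueError(f"unknown score kind: {kind}")
```

A reader that wants Stockfish units would apply encode_mate_sf to the mate distance instead of storing the generic value, or drop mate positions altogether.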
While writing the code, I couldn't help but notice
- castling encoding is not Chess960 compatible, and that's easy to fix. If you take the castling rooks (of both colors), you will notice that they occupy at most 2 files, so you just need 2*3 extra bits to encode those files (3 bits per file). That gives you 228+6=234, still well below 256 bits.
- if changing the format is possible (is it?), you may as well correct the way rule50 and fullMove are encoded, just 7 bits for rule50 and 16 for fullMove, the way it was meant to be (instead of the clumsy lower part followed by upper part that is currently implemented).
- I think it should be possible to mark correct castling rights in the .bin format. Whether it's castling in normal chess or Chess960 can be resolved based on the parser state. If you feel like trying to implement it in a backward compatible way, feel free; I don't really care about Chess960 personally.
- The current implementation is the only backward compatible fix that was possible for the issue of initially having only 6 bits for rule50, see https://github.com/nodchip/Stockfish/pull/182/files
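
For illustration only, the contiguous layout suggested above (7 bits for rule50 followed by 16 bits for fullMove) could be sketched as below. This is not the existing .bin layout: BitWriter, pack_counters and unpack_counters are hypothetical names, and the least-significant-bit-first order is an assumption.

```python
class BitWriter:
    """Accumulates fields into one integer, least significant bit first."""

    def __init__(self) -> None:
        self.bits = 0    # packed value so far
        self.length = 0  # number of bits written

    def write(self, value: int, width: int) -> None:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        self.bits |= value << self.length
        self.length += width


def pack_counters(rule50: int, full_move: int) -> tuple[int, int]:
    """Pack rule50 (7 bits) then fullMove (16 bits); return (bits, length)."""
    writer = BitWriter()
    writer.write(rule50, 7)
    writer.write(full_move, 16)
    return writer.bits, writer.length


def unpack_counters(bits: int) -> tuple[int, int]:
    """Inverse of pack_counters: return (rule50, fullMove)."""
    rule50 = bits & 0x7F
    full_move = (bits >> 7) & 0xFFFF
    return rule50, full_move


packed, length = pack_counters(99, 300)
print(packed, length, unpack_counters(packed))
```

Seven bits cover rule50 values up to 127, so there is no need to split the counter into a lower and an upper part. An old reader would not understand this layout, which is why a change like this is not backward compatible.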