
Noisy evaluation

Open lucasart opened this issue 2 years ago • 3 comments

The idea is to make human-playable levels.

The simplest solution is to limit depth (and possibly nodes, depending on how precisely a node limit can be heeded by the engine). The problem, especially with NN engines, is that the evaluation is insanely strong, so it feels like being completely dominated strategically until you eventually win out of nowhere, because the computer blunders an obvious tactic.

The natural solution is to pollute the evaluation with random noise, in order to weaken tactical and positional play in a coherent manner.

Another benefit of randomness is, assuming random seeding, that the engine will choose different moves when presented with the same position several times. This is important if you want to use the computer as a sparring partner, otherwise you will be playing the same game, iteratively improving your mistake and eventually winning based on a memorized sequence. Randomness at the GUI level in book move selection is not enough, as people use system openings, forcing transposition.

I have used this scheme with good results in my engine. For level L:

  • depth=L
  • nodes=64<<L (not really necessary, more a backstop in case no time limit is specified)
  • logistic random draw with scale = EndgamePawnValue * 0.75^(L-1) added to each evaluation.
  • Threads = 1 (always disable SMP for levels: fixed-depth strength increases with thread count as a result of the shared hash table, and the node count needed to reach a given depth also grows with threads, adding more unpredictability)
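The bullets above can be sketched as a simple per-level limits table. This is an illustrative sketch, not code from any engine; the names `LevelLimits`, `limits_for_level`, and the `ENDGAME_PAWN_VALUE` of 200 centipawns are all assumptions.

```rust
// Assumed centipawn value of a pawn in the endgame; engine-specific in practice.
const ENDGAME_PAWN_VALUE: f64 = 200.0;

/// Hypothetical search limits for a human-playable level L.
struct LevelLimits {
    depth: u32,
    nodes: u64,
    noise_scale: f64,
    threads: u32,
}

fn limits_for_level(l: u32) -> LevelLimits {
    LevelLimits {
        depth: l,
        // Backstop in case no time limit is specified.
        nodes: 64u64 << l,
        // Noise scale shrinks geometrically as the level rises.
        noise_scale: ENDGAME_PAWN_VALUE * 0.75f64.powi(l as i32 - 1),
        // SMP disabled: fixed-depth strength increases with thread count.
        threads: 1,
    }
}
```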

There is also the Stockfish method, using MultiPV, but it is rather bad in my experience. It will play really strong, and if you survive for long enough and start to take control of the position, at some point a singularity is reached and the engine collapses, giving you all its pieces with zero resistance.

PS: The reason for choosing the logistic distribution is that it has fat tails and is easy to implement. Draw p from uniform(0,1) and calculate scale * log(p / (1 - p)).
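The logistic draw above is a one-liner; here is a sketch in Rust, with p passed in rather than drawn so the PRNG choice stays separate (the function name is illustrative).

```rust
/// Logistic noise via inverse-CDF sampling: given p ~ Uniform(0,1),
/// scale * ln(p / (1 - p)) is logistic-distributed with the given scale.
fn logistic_noise(scale: f64, p: f64) -> f64 {
    // Endpoints would give +/- infinity, so p must lie strictly inside (0, 1).
    debug_assert!(p > 0.0 && p < 1.0);
    scale * (p / (1.0 - p)).ln()
}
```

The distribution is symmetric around zero, so on average the noise neither inflates nor deflates the evaluation; the fat tails mean occasional large misjudgements, which is what produces human-looking blunders.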

lucasart avatar Oct 04 '23 00:10 lucasart

strikes me as a good idea. my initial instinct is to take a similar approach to eval randomisation as with draw score randomisation, where the random factor in evaluation is computed from the modulus of the nodecount at that point in the search, resulting in deterministic behaviour (good for finding bugs) - that said, you make a good case for the value of indeterminism, so i'll think about that too. if i find the time, i'll definitely implement your request.
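A minimal sketch of the deterministic variant, assuming the noise is keyed on the node count as described; the span of 32 centipawns and the mixing constant are illustrative, not from viridithas.

```rust
// Width of the noise interval in centipawns (hypothetical value).
const NOISE_SPAN: i64 = 32;

/// Deterministic eval noise derived from the current node count.
/// Same position searched the same way always gets the same noise,
/// which keeps search behaviour reproducible for debugging.
fn deterministic_noise(nodes: u64) -> i64 {
    // Cheap integer mix (golden-ratio constant) so consecutive node
    // counts don't produce consecutive noise values.
    let mixed = nodes.wrapping_mul(0x9E3779B97F4A7C15);
    // Map into [-16, 15] centipawns.
    (mixed % NOISE_SPAN as u64) as i64 - NOISE_SPAN / 2
}
```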

cosmobobak avatar Oct 04 '23 14:10 cosmobobak

Indeterminism isn't necessarily incompatible with debugging. For example you can write info seed xxx to stdout after a go command is parsed, just before searching. Potentially a verbosity level to control it (e.g. one info per completed depth is medium verbosity, one per iteration of aspiration search is high verbosity, things like that).

Node count modulus is easier to start with. But you may run into another problem: node counts are low and don't cover the interval, especially at low depth.

lucasart avatar Oct 04 '23 23:10 lucasart

Here's a good, fast, bloat-free PRNG, with seed = state on 64 bits. Should be more than enough for this use case. https://xorshift.di.unimi.it/splitmix64.c

Note that uint64_t arithmetic is modulo 2^64 in C, but overflow will panic in Rust by default (in debug builds), so you should explicitly tell the compiler that you want modular arithmetic.

lucasart avatar Oct 05 '23 03:10 lucasart