AlphaZero.jl
Fpu implementation
Adds FPU ("First Play Urgency") inspired by Leela-Chess:
- 2 FPU strategies:
  - reduction: subtract `fpu_value` from the parent's value
  - absolute: directly replace the new node's value with `fpu_value`
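As a rough illustration of the two strategies, here is a minimal sketch. It is not the code from this PR: the names `FpuStrategy`, `unvisited_value`, and `child_q` are hypothetical, and it assumes values are from the point of view of the player choosing the action.

```julia
# Hypothetical sketch of the two FPU strategies; not the AlphaZero.jl API.
@enum FpuStrategy FpuReduction FpuAbsolute

# Value assumed for a child action that has not been visited yet.
function unvisited_value(strategy::FpuStrategy, fpu_value::Real, parent_value::Real)
    if strategy == FpuReduction
        # reduction: start from the parent's value and subtract fpu_value
        return parent_value - fpu_value
    else
        # absolute: ignore the parent and use fpu_value directly
        return fpu_value
    end
end

# Mean value used during action selection: the real average once the child
# has been visited, the FPU estimate otherwise.
function child_q(total_value::Real, visits::Integer,
                 strategy::FpuStrategy, fpu_value::Real, parent_value::Real)
    if visits > 0
        return total_value / visits
    else
        return unvisited_value(strategy, fpu_value, parent_value)
    end
end
```

With the reduction strategy, a positive `fpu_value` makes unvisited children look slightly worse than their parent, so the search leans toward moves it has already evaluated.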
Note: FPU is not applied yet at the root node, as in game of `src/mcts.jl:explore!` for example.
Feel free to propose any modifications.
This looks great. I will merge after we run some experiments on Connect Four. :-)
Based on lc0, a relative FPU that is fairly negative is probably a good default.