AlphaZero.jl

FPU implementation

Open Whojo opened this issue 2 years ago • 2 comments

Adds FPU ("First Play Urgency"), inspired by Leela Chess Zero (lc0):

  • 2 FPU strategies
    • reduction: subtract fpu_value from the parent's value
    • absolute: directly replace a new node's value with fpu_value
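
To make the two strategies concrete, here is a minimal Python sketch of how an unvisited child's value estimate could be chosen. It is not the code from this PR (AlphaZero.jl is written in Julia), and the names `Child`, `child_value`, `strategy`, and `parent_value` are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Child:
    """Hypothetical per-action statistics stored in an MCTS node."""
    visits: int = 0
    total_value: float = 0.0


def child_value(child: Child, parent_value: float,
                strategy: str, fpu_value: float) -> float:
    """Return the value estimate used when selecting among children.

    Visited children use their empirical mean value. Unvisited children
    fall back to the FPU value, computed with one of the two strategies
    described above (assumed semantics, not the PR's actual code).
    """
    if child.visits > 0:
        return child.total_value / child.visits
    if strategy == "reduction":
        # Start from the parent's value and subtract fpu_value.
        return parent_value - fpu_value
    if strategy == "absolute":
        # Ignore the parent and use fpu_value directly.
        return fpu_value
    raise ValueError(f"unknown FPU strategy: {strategy!r}")
```

Under these assumptions, a positive `fpu_value` with the reduction strategy makes unexplored moves look slightly worse than their parent, which biases the search toward moves that have already been evaluated.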

Note: FPU is not yet applied at the root node (see src/mcts.jl:explore!, for example).

Feel free to propose any modifications.

Whojo, Apr 27 '22 23:04

This looks great. I will merge after we run some experiments on Connect Four. :-)

jonathan-laurent, May 04 '22 23:05

Based on lc0, a relative FPU that is fairly negative is probably a good default.

oscardssmith, May 04 '22 23:05