RoyalUr-Analysis
Theoretically Optimal Play
I am working on using policy iteration to compute optimal play under the Finkel ruleset. I have run the search algorithm and there are ~137M non-terminal board states. With enough computing power, it should be feasible to run many iterations of the algorithm and converge to the optimal strategy.
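For readers unfamiliar with the technique, here is a minimal sketch of policy iteration on a toy model. Everything in it is an assumption made for illustration: the two states, the "safe" and "risky" actions, and the probabilities are invented and have nothing to do with the Finkel ruleset. It is also a single-agent simplification, whereas the real game has two players taking turns. The value of a state is the probability of eventually winning from it.

```python
# Toy model (hypothetical numbers, not derived from any real ruleset).
# TRANSITIONS[state][action] is a list of (probability, next_state) pairs.
# "WIN" and "LOSE" are terminal states.
TRANSITIONS = {
    "A": {"safe": [(0.5, "A"), (0.5, "B")],
          "risky": [(0.9, "WIN"), (0.1, "LOSE")]},
    "B": {"safe": [(0.7, "WIN"), (0.2, "A"), (0.1, "LOSE")],
          "risky": [(0.5, "WIN"), (0.5, "LOSE")]},
}
TERMINAL_VALUE = {"WIN": 1.0, "LOSE": 0.0}


def q_value(values, outcomes):
    """Expected value of an action given the current state values."""
    return sum(prob * values[nxt] for prob, nxt in outcomes)


def evaluate_policy(policy, tol=1e-12, max_sweeps=100_000):
    """Policy evaluation: sweep in place until the values stop changing."""
    values = dict(TERMINAL_VALUE)
    values.update({state: 0.0 for state in TRANSITIONS})
    for _ in range(max_sweeps):
        delta = 0.0
        for state in TRANSITIONS:
            new = q_value(values, TRANSITIONS[state][policy[state]])
            delta = max(delta, abs(new - values[state]))
            values[state] = new
        if delta < tol:
            break
    return values


def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {state: "safe" for state in TRANSITIONS}
    while True:
        values = evaluate_policy(policy)
        improved = {}
        for state, actions in TRANSITIONS.items():
            best = max(actions, key=lambda a: q_value(values, actions[a]))
            gain = (q_value(values, actions[best])
                    - q_value(values, actions[policy[state]]))
            # Only switch on a strict improvement, so ties cannot cause cycling.
            improved[state] = best if gain > 1e-9 else policy[state]
        if improved == policy:
            return policy, values
        policy = improved


if __name__ == "__main__":
    final_policy, final_values = policy_iteration()
    print(final_policy, final_values)
```

Because state "A" can loop back to itself and "B" can return to "A", the evaluation step has no finite depth at which it is exact; it only converges in the limit, which is why a tolerance is used. The same kind of loop is the reason the full game needs an iterative method rather than a single backward pass.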
I am currently bottlenecked by the RAM required to build the full game graph needed to run the algorithm, but I expect to have access to some high-performance computing capacity in the near future. You can find my plans in planning.md in my repo.
I am wondering whether anyone else has attempted a similar approach and whether there are any pitfalls I should watch out for. I suppose you could say this is just expectimax on steroids, given the unlimited backup depth. Any critique is appreciated!