open_spiel issues

Does alphazero support reuse-tree?

4

Does alphazero support reuse-tree?

Nightbringers

question

Q-learning is a loser?

6

Hello, I'm trying to create a strong Mancala bot. I chose Q-learning: `# Let's do independent Q-learning in Mancala, and play it against random. # RL is based on python/examples/independent_tabular_qlearning.py...

StepHaze

Updating Alpha Zero

9

This is a feature proposal. While basing my work on this version of AlphaZero (TensorFlow, in both Python and C++), I addressed a number of points, including:...

ramizouari

contribution welcome

Add: implementation MF-PPO

4

MF-PPO algorithm implemented in the paper: ``` @inproceedings{algumaei2023regularization, title={Regularization of the policy updates for stabilizing Mean Field Games}, author={Algumaei, Talal and Solozabal, Ruben and Alami, Reda and Hacid, Hakim and...

rubensolozabal

Information Tensor for Universal Poker/ACPC is abstracted even when the game is fullgame

11

I noticed that the information tensor for Universal Poker works when the game is abstracted, but when the game is unabstracted (i.e. full game) the information tensor is still abstracted,...

VitamintK

bug

Added JAX implementation of CFR

3

Implementation of CFR that uses JAX. This allows running CFR with GPU acceleration. The speedup over the Python CFR implementation is ~10x on CPU only. The goal was to make it...

kubicon

RNaD off policy case

4

In the example for RNaD, the importance sampling correction for get_loss_nerd is 1. This is because the example provided is the on-policy case, and there are synchronous updates of the...

spktrm

Added Python efr algorithm implementation

6

A Python implementation of the EFR (https://arxiv.org/abs/2102.06973) algorithm with the deviation types defined in the proposing paper. The implementation was developed as part of my undergraduate dissertation and I thought...

Jamesflynn1

About how to load subgame of Libratus

2

Hi, I am currently working on experiments for my algorithm and intend to test it on the subgame of Libratus mentioned in [^1]. I have noticed that the **universal_poker** has...

Atan03

Quoridor Movement Action IDs keep changing

16

So, I'm playing Quoridor, and I was trying to figure out which action IDs corresponded to moving the agent. Therefore, after I placed all the walls, I went to look...

aadharna

help wanted

contribution welcome
